We consider two-player games played for an infinite number of rounds, with $\omega$-regular
winning conditions. The games may be concurrent, in that the players choose their moves
simultaneously and independently, and probabilistic, in that the moves determine a
probability distribution for the successor state. We introduce quantitative game
$\mu$-calculus, and we show that the maximal probability of winning such games can be
expressed as fixpoint formulas in this calculus. We develop the arguments both for
deterministic and for probabilistic concurrent games; as a special case, we solve
probabilistic turn-based games with $\omega$-regular winning conditions, which was also open.
We also characterize the optimality, and the memory requirements, of the winning strategies.
In particular, we show that while memoryless strategies suffice for winning games with safety
and reachability conditions, Büchi conditions require the use of strategies with infinite
memory. The existence of optimal strategies, as opposed to $\varepsilon$-optimal, is only
guaranteed in games with safety winning conditions.

Key Words: Automata, games, $\mu$-calculus, probabilistic algorithm, temporal

1. INTRODUCTION

We consider two-player games played on finite state spaces for an infinite number of
rounds. In each round, depending on the current state of the game, the moves of one or
both players determine the next state [Sha53]; we consider games in which the set of
available moves is finite. Such games offer a model for systems composed of interacting
components, and they have been studied under a wide range of winning conditions. The
winning conditions are often codified by associating a reward with each state and choice
of moves, and by studying the maximal discounted, total, or average reward that player 1
can obtain in such a game; for surveys of algorithms for solving games with respect to
such winning conditions see, e.g., [RF91, FV97]. Here, we consider winning conditions
consisting of $\omega$-regular automata acceptance conditions defined over the state space
of the game [BL69, GH82, Tho95]. Given a game with an $\omega$-regular winning condition
and a starting state $s$, we study the maximal probability with which player 1 can ensure
that the condition holds from $s$; we call this maximal probability the value of the game
at $s$ for player 1. The determinacy result of [Mar98] ensures that, at all states and for
all $\omega$-regular winning conditions, the value of the game for player 1 is equal to one
minus the value of the game with complementary condition for player 2.

We distinguish between turn-based and concurrent games, and between deterministic and
probabilistic games. Systems in which the interaction between the components is
asynchronous give rise to turn-based games, where in each round only one of the two
players can choose among several moves. On the other hand, synchronous interaction leads
to concurrent games, where in each round both players can choose simultaneously and
independently among several moves. The games are deterministic if the current state and
the moves uniquely determine the successor state, and are probabilistic if the current
state and the moves determine a probability distribution for the successor state. For any
$\omega$-regular winning condition, the value of a deterministic turn-based game at a state
is either 0 or 1; moreover, player 1 can achieve this value by playing according to a
deterministic strategy, that selects a move based on the current state and on the history
of the game [BL69, GH82]. In contrast, the value of a concurrent game at a state may be
strictly between 0 and 1; furthermore, achieving this value may require the use of
randomized strategies, that select not a move, but a probability distribution over moves.
To see this, consider the concurrent game MatchOneBit. The game starts at state $s_0$,
where both players simultaneously and independently choose a bit (0 or 1); if the bits
match, the game proceeds to state $s_{win}$; otherwise, it proceeds to state $s_{lose}$.
The states $s_{win}$ and $s_{lose}$ are absorbing: if one of them is reached, the game is
confined there forever. Consider the safety condition $\Box\{s_0, s_{win}\}$, requiring
that $s_{lose}$ is not entered. For every deterministic strategy of player 1, player 2 has
another (complementary) deterministic strategy that ensures a transition to $s_{lose}$;
hence, if player 1 could only use deterministic strategies, he would win with probability
0. However, if player 1 uses a randomized strategy that chooses each of the two bits with
probability 1/2, then the game enters state $s_{win}$ with probability 1/2, regardless of
the strategy of player 2; indeed, the value of the game at $s_0$ is 1/2.
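The MatchOneBit argument can be checked numerically. The following Python fragment is a minimal sketch, not part of the original development: the names `PAYOFF` and `guaranteed` are illustrative, and the search over mixed strategies is restricted to a grid of probabilities, which is an assumption made only to keep the example short.

```python
from fractions import Fraction

# Payoff to player 1 at s_0: 1 if the two bits match (move to s_win), 0 otherwise.
# Rows are the bits of player 1, columns the bits of player 2.
PAYOFF = [[1, 0],
          [0, 1]]


def guaranteed(p):
    """Winning probability that player 1 secures by playing bit 0 with probability p."""
    row_mix = [p, 1 - p]
    # Player 2 can answer with the worst pure bit (column) for player 1.
    return min(sum(row_mix[i] * PAYOFF[i][j] for i in range(2)) for j in range(2))


# Deterministic strategies of player 1 correspond to p = 1 (bit 0) and p = 0 (bit 1).
deterministic = max(guaranteed(Fraction(b)) for b in (0, 1))
# The uniform randomized strategy.
uniform = guaranteed(Fraction(1, 2))
# Best guarantee over a grid of randomized strategies (illustrative grid of step 1/100).
best = max(guaranteed(Fraction(k, 100)) for k in range(101))
```

Under these assumptions, `deterministic` evaluates to 0 while `uniform` and `best` both evaluate to 1/2, matching the value of the game at $s_0$.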

The value of deterministic turn-based games with $\omega$-regular winning conditions can
be computed with the algorithms of [BL69, GH82, EJ91, Tho95]. The algorithms of [EJ91] are
based on the use of game $\mu$-calculus, obtained by replacing the predecessor operator
Pre of classical $\mu$-calculus [Koz83b] by the controllable predecessor operator Cpre:
for a set of states $U$, the set $\mathrm{Cpre}(U)$ consists of the states from which
player 1 can force the game into $U$ in one step. A richer version of game $\mu$-calculus
was used in [dAH00] to provide qualitative solutions for concurrent probabilistic games
with $\omega$-regular conditions. There, multi-argument predecessor operators are used to
compute the set of states from which player 1 can win with probability 1, or arbitrarily
close to 1.

We introduce quantitative game $\mu$-calculus, and use it to provide a uniform framework
for understanding and solving concurrent games with $\omega$-regular winning conditions. In
quantitative game $\mu$-calculus, sets of states are replaced by functions from states to
the interval $[0,1]$, and the controllable predecessor operator Cpre is replaced by a
quantitative version Ppre. Given a function $f$ from states to the interval $[0,1]$, the
function $g = \mathrm{Ppre}(f)$ associates with each state the maximal expected value of
$f$ that player 1 can ensure in one step. The operator Ppre can be evaluated using results
about matrix games [vNM47, Owe95]. Related quantitative predecessor operators for
one-player or turn-based structures were considered in [Koz83a, MMS96, HK97, McI98, MM01].
We show that the values of concurrent games with $\omega$-regular conditions can be
obtained simply by replacing Cpre by Ppre in the solutions of [EJ91]. The result is
surprising because concurrent games differ from turn-based deterministic games in several
fundamental respects. First, concurrent games require in general the use of randomized
strategies, as remarked above. Second, even for the simple winning condition of
reachability, optimal strategies may not exist: one can only guarantee the existence of
$\varepsilon$-optimal strategies for all $\varepsilon > 0$ [Eve57]. Third, whereas
finite-memory strategies suffice for winning deterministic turn-based games, in concurrent
games both $\varepsilon$-optimal strategies, and optimal strategies if they exist, may need
an infinite amount of memory [dAH00]. Fourth, the standard recursive structure of proofs
for deterministic turn-based games [McN93, Tho95] breaks down, as both players can choose a
distribution over moves at each state.

We develop the arguments both for deterministic and for probabilistic concurrent games.
Hence, as a special case we solve probabilistic turn-based games with $\omega$-regular
winning conditions, which was also an open problem. The quantitative game $\mu$-calculus
solution formulas provide the value also of games with countable, rather than finite, state
space. We also characterize the optimality, and the memory requirements, of the winning
strategies. In particular, we show that while memoryless strategies suffice for winning
games with safety and reachability conditions, Büchi and Rabin-chain conditions require the
use of strategies with infinite memory. The existence of optimal strategies, as opposed to
$\varepsilon$-optimal, is only guaranteed in games with safety winning conditions.

The solution formulas we present in this paper also solve the model-checking problem for
the probabilistic temporal logics pCTL and pCTL* over concurrent games. The logics pCTL and
pCTL*, originally proposed over Markov chains [ASB+95] and Markov decision processes
[BdA95], can express the maximal and minimal probability with which linear-time temporal
logic (LTL) formulas are satisfied. These logics can be immediately generalized to
concurrent games, by considering the maximal probability with which a player can ensure
that the formula holds. Since LTL formulas can be translated into deterministic Rabin-chain
automata [Saf88, Saf92, VW94], our results characterize the validity of pCTL and pCTL*
formulas over concurrent games.

As remarked by [EJ91] in the context of deterministic turn-based games, the use of
$\mu$-calculus for solving games helps in the formulation of the correctness arguments. In
order to argue the correctness of a solution formula, we need to show that player 1 has an
optimal (or $\varepsilon$-optimal) strategy that realizes the value given by the formula,
and that player 2 has a “spoiling” strategy that is optimal (or $\varepsilon$-optimal) for
the game with the complementary condition. Since the operator Ppre in the solution formula
refers to player 1, an optimal strategy for player 1 can be constructed from the fixpoint
of the formula. On the other hand, the derivation of spoiling strategies for player 2 is
not immediate: indeed, even for games with safety or reachability conditions, the standard
argument involves the consideration of discounted versions of the games (see, e.g.,
[FV97]). In contrast, by writing the solution formula in game $\mu$-calculus, we place the
burden of the argument on the syntactic complementation of the solution formula.
Specifically, for a winning condition $\Psi$, we characterize the maximal probabilities of
winning the game by a $\mu$-calculus formula $\phi$, and from $\phi$ we construct an
optimal (or $\varepsilon$-optimal) strategy for player 1. The syntactic complement
$\neg\phi$ of $\phi$ gives the maximal probabilities for player 2 to win the dual game with
condition $\neg\Psi$. From $\neg\phi$ we can again construct an optimal (or
$\varepsilon$-optimal) strategy for player 2 for the game with condition $\neg\Psi$. The
two constructions are enough to conclude the correctness of our solution formulas.

The iterative interpretation of quantitative game $\mu$-calculus leads to algorithms for
the computation of approximate solutions. By representing value functions symbolically,
these algorithms may be used for the approximate analysis of games with very large state
spaces [BMCD90, dAKN+00]. Unfortunately, except for safety and reachability conditions, the
alternation of least and greatest fixpoint operators in the solution formulas leads to
approximation schemes that do not converge monotonically to the value of a game. This
situation contrasts with the one for Markov decision processes, where
monotonically-converging approximation schemes are available, and where the maximal winning
probability can be computed in polynomial time by reduction to linear programming [CY90].
We show that this discrepancy is no accident, since the basic device for solving Markov
decision processes with $\omega$-regular conditions, viz., a reduction to reachability,
fails for games.

2.  CONCURRENT  GAMES 

For a countable set $A$, a probability distribution on $A$ is a function
$p : A \mapsto [0,1]$ such that $\sum_{a \in A} p(a) = 1$. We denote the set of probability
distributions on $A$ by $\mathcal{D}(A)$. A (two-player) concurrent game structure
$\mathcal{G} = \langle S, \mathit{Moves}, \Gamma_1, \Gamma_2, p \rangle$ consists of the
following components:

• A finite state space $S$.

• A finite set $\mathit{Moves}$ of moves.

• Two move assignments $\Gamma_1, \Gamma_2 : S \mapsto 2^{\mathit{Moves}} \setminus \emptyset$.
For $i \in \{1,2\}$, assignment $\Gamma_i$ associates with each state $s \in S$ the
non-empty set $\Gamma_i(s) \subseteq \mathit{Moves}$ of moves available to player $i$ at
state $s$.

• A probabilistic transition function $p$, that gives the probability
$p(t \mid s, a_1, a_2)$ of a transition from $s$ to $t$ for all $s, t \in S$ and all moves
$a_1 \in \Gamma_1(s)$ and $a_2 \in \Gamma_2(s)$.

At every state $s \in S$, player 1 chooses a move $a_1 \in \Gamma_1(s)$, and simultaneously
and independently player 2 chooses a move $a_2 \in \Gamma_2(s)$. The game then proceeds to
the successor state $t$ with probability $p(t \mid s, a_1, a_2)$, for all $t \in S$. We
assume that the players act non-cooperatively, i.e., each player chooses her strategy
independently and secretly from the other player, and is only interested in maximizing her
own reward. A path of $\mathcal{G}$ is an infinite sequence
$\overline{s} = s_0, s_1, s_2, \ldots$ of states in $S$ such that for all $k \ge 0$, there
are moves $a_1^k \in \Gamma_1(s_k)$ and $a_2^k \in \Gamma_2(s_k)$ with
$p(s_{k+1} \mid s_k, a_1^k, a_2^k) > 0$. We denote by $\Omega$ the set of all paths.

We distinguish the following special classes of concurrent game structures.

• A concurrent game structure $\mathcal{G}$ is deterministic if for all $s \in S$ and all
$a_1 \in \Gamma_1(s)$, $a_2 \in \Gamma_2(s)$, there is a $t \in S$ such that
$p(t \mid s, a_1, a_2) = 1$.

• A concurrent game structure $\mathcal{G}$ is turn-based if at every state at most one
player can choose among multiple moves; that is, if for every state $s \in S$ there exists
at most one $i \in \{1,2\}$ with $|\Gamma_i(s)| > 1$.

For brevity, we refer to concurrent turn-based game structures simply as turn-based game
structures.
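The definitions above translate directly into a small data structure. The sketch below is one possible Python encoding, given only for illustration: the class name `ConcurrentGame`, its field names, and the dictionary layout of the transition function are assumptions of this example, not notation from the paper. It encodes the MatchOneBit game of the introduction and checks the two special classes.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass
class ConcurrentGame:
    states: frozenset   # S
    moves: frozenset    # Moves
    gamma1: dict        # state -> non-empty set of moves available to player 1
    gamma2: dict        # state -> non-empty set of moves available to player 2
    p: dict             # (s, a1, a2) -> {t: probability of a transition from s to t}

    def validate(self):
        """Check the well-formedness conditions of a concurrent game structure."""
        for s in self.states:
            for gamma in (self.gamma1, self.gamma2):
                assert gamma[s] and gamma[s] <= self.moves
            for a1 in self.gamma1[s]:
                for a2 in self.gamma2[s]:
                    dist = self.p[(s, a1, a2)]
                    assert set(dist) <= self.states
                    assert all(q >= 0 for q in dist.values())
                    assert sum(dist.values()) == 1
        return True

    def is_deterministic(self):
        """Every pair of moves leads to a single successor with probability 1."""
        return all(1 in dist.values() for dist in self.p.values())

    def is_turn_based(self):
        """At every state at most one player has more than one available move."""
        return all(min(len(self.gamma1[s]), len(self.gamma2[s])) <= 1
                   for s in self.states)


# MatchOneBit: matching bits lead to "win", mismatching bits to "lose".
S = frozenset({"s0", "win", "lose"})
BITS = frozenset({0, 1})
gamma = {"s0": {0, 1}, "win": {0}, "lose": {0}}
p = {("s0", a, b): {("win" if a == b else "lose"): Fraction(1)}
     for a in BITS for b in BITS}
p[("win", 0, 0)] = {"win": Fraction(1)}
p[("lose", 0, 0)] = {"lose": Fraction(1)}
match_one_bit = ConcurrentGame(S, BITS, gamma, dict(gamma), p)
```

With this encoding, MatchOneBit is deterministic but not turn-based, since both players have two moves at `s0`.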


2.1.  Randomized  strategies 

A strategy for player $i \in \{1,2\}$ is a mapping
$\pi_i : S^+ \mapsto \mathcal{D}(\mathit{Moves})$ that associates with every nonempty
finite sequence $\sigma \in S^+$ of states, representing the past history of the game, a
probability distribution $\pi_i(\sigma)$ used to select the next move. Thus, the choice of
the next move can be history-dependent and randomized. The strategy $\pi_i$ can prescribe
only moves that are available to player $i$: that is, for all sequences $\sigma \in S^*$
and states $s \in S$, we require that $\pi_i(\sigma s)(a) > 0$ only if
$a \in \Gamma_i(s)$. We denote by $\Pi_i$ the set of all strategies for player
$i \in \{1,2\}$. A strategy $\pi$ is deterministic if for all $\sigma \in S^+$ there exists
$a \in \mathit{Moves}$ such that $\pi(\sigma)(a) = 1$. Thus, deterministic strategies are
equivalent to functions $S^+ \mapsto \mathit{Moves}$. A strategy $\pi$ is finite-memory if
the distribution chosen at every state $s \in S$ depends only on $s$ itself, and on a
finite number of bits of information about the past history of the game. A strategy $\pi$
is memoryless if $\pi(\sigma s) = \pi(s)$ for all $s \in S$ and all $\sigma \in S^*$.

Once the starting state $s$ and the strategies $\pi_1$ and $\pi_2$ for the two players have
been chosen, the game is reduced to an ordinary stochastic process. Hence, the
probabilities of events are uniquely defined, where an event $\mathcal{A} \subseteq \Omega$
is a measurable set of paths$^2$. For an event $\mathcal{A} \subseteq \Omega$, we denote by
$\Pr_s^{\pi_1,\pi_2}(\mathcal{A})$ the probability that a path belongs to $\mathcal{A}$
when the game starts from $s$ and the players use the strategies $\pi_1$ and $\pi_2$.
Similarly, for a measurable function $f$ that associates a number in
$\mathbb{R} \cup \{\infty\}$ with each path, we denote by $E_s^{\pi_1,\pi_2}\{f\}$ the
expected value of $f$ when the game starts from $s$ and the strategies $\pi_1$ and $\pi_2$
are used. We denote by $\Theta_i$ the random variable representing the $i$-th state of a
path; formally, $\Theta_i$ is a variable that assumes value $s_i$ on the path
$s_0, s_1, s_2, \ldots$.
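For memoryless strategies, the stochastic process obtained by fixing both strategies is a finite Markov chain, and probabilities of simple events can be computed directly. The following Python sketch illustrates this on MatchOneBit; the function names `induced_chain` and `reach_within`, and the particular strategy chosen for player 2, are assumptions made for the example.

```python
from fractions import Fraction

# MatchOneBit: (state, bit of player 1, bit of player 2) -> successor distribution.
P = {("s0", a, b): {("win" if a == b else "lose"): Fraction(1)}
     for a in (0, 1) for b in (0, 1)}
P[("win", 0, 0)] = {"win": Fraction(1)}
P[("lose", 0, 0)] = {"lose": Fraction(1)}


def induced_chain(p, pi1, pi2):
    """Markov chain obtained by fixing memoryless strategies pi1 and pi2."""
    chain = {}
    for s in pi1:
        row = {}
        for a1, q1 in pi1[s].items():
            for a2, q2 in pi2[s].items():
                for t, q in p[(s, a1, a2)].items():
                    row[t] = row.get(t, 0) + q1 * q2 * q
        chain[s] = row
    return chain


def reach_within(chain, target, n):
    """Probability of visiting `target` within n steps, from every state."""
    x = {s: Fraction(int(s in target)) for s in chain}
    for _ in range(n):
        x = {s: Fraction(1) if s in target
             else sum(q * x[t] for t, q in chain[s].items())
             for s in chain}
    return x


absorbing = {0: Fraction(1)}
# Player 1 plays uniformly at s0; player 2 uses an arbitrary (illustrative) mix.
pi1 = {"s0": {0: Fraction(1, 2), 1: Fraction(1, 2)}, "win": absorbing, "lose": absorbing}
pi2 = {"s0": {0: Fraction(1, 3), 1: Fraction(2, 3)}, "win": absorbing, "lose": absorbing}
chain = induced_chain(P, pi1, pi2)
prob_win = reach_within(chain, {"win"}, 3)["s0"]
```

Here `prob_win` is 1/2 whatever memoryless mix is chosen for player 2, in line with the discussion of MatchOneBit in the introduction.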


2.2.  Winning  conditions 

Given a concurrent game structure
$\mathcal{G} = \langle S, \mathit{Moves}, \Gamma_1, \Gamma_2, p \rangle$, we consider
winning conditions expressed by linear-time temporal logic (LTL) formulas, whose atomic
propositions correspond to subsets of the set $S$ of states [MP91]. We focus on winning
conditions that correspond to safety or reachability properties, as well as winning
conditions that correspond to the accepting criteria of Büchi, co-Büchi, and Rabin-chain
automata [Mos84, EJ91]. We call games with such winning conditions safety, reachability,
Büchi, co-Büchi, and Rabin-chain games, respectively. The ability to solve games with
Rabin-chain conditions suffices for solving games with arbitrary LTL (or $\omega$-regular)
winning conditions: in fact, it suffices to encode the $\omega$-regular condition as a
deterministic Rabin-chain automaton, and then to solve the game consisting of the
synchronous product of the original game with the Rabin-chain automaton [Mos84, Tho95].

$^2$ To be precise, we should define events as measurable sets of paths sharing the same
initial state. However, our (slightly) improper definition leads to more concise notation.

Given an LTL winning condition $\Psi$, by abuse of notation we denote equally by $\Psi$
the set of paths $\overline{s} \in \Omega$ that satisfy $\Psi$; this set is measurable for
any choice of strategies for the two players [Var85]. Hence, the probability that a path
satisfies $\Psi$ starting from state $s \in S$ under strategies $\pi_1, \pi_2$ for the two
players is $\Pr_s^{\pi_1,\pi_2}(\Psi)$. Given a state $s \in S$ and a winning condition
$\Psi$, we are interested in finding the maximal probability with which player
$i \in \{1,2\}$ can ensure that $\Psi$ holds from $s$. We call such probability the value
of the game at $s$ for player $i \in \{1,2\}$. This value for player 1 is given by the
function $\langle 1 \rangle \Psi : S \mapsto [0,1]$, defined for all $s \in S$ by

$$\langle 1 \rangle \Psi(s) = \sup_{\pi_1 \in \Pi_1} \inf_{\pi_2 \in \Pi_2} \Pr\nolimits_s^{\pi_1,\pi_2}(\Psi).$$


The value for player 2 is given by the function $\langle 2 \rangle \Psi$, defined
symmetrically. Concurrent games satisfy a quantitative version of determinacy [Mar98],
stating that for all LTL conditions $\Psi$ and all $s \in S$, we have

$$\langle 1 \rangle \Psi(s) = 1 - \langle 2 \rangle \neg\Psi(s).$$

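A strategy $\pi_1$ for player 1 is optimal if for all $s \in S$ we have

$$\inf_{\pi_2 \in \Pi_2} \Pr\nolimits_s^{\pi_1,\pi_2}(\Psi) = \langle 1 \rangle \Psi(s).$$

For $\varepsilon > 0$, a strategy $\pi_1$ for player 1 is $\varepsilon$-optimal if for all
$s \in S$ we have

$$\inf_{\pi_2 \in \Pi_2} \Pr\nolimits_s^{\pi_1,\pi_2}(\Psi) \ge \langle 1 \rangle \Psi(s) - \varepsilon.$$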

We define optimal and $\varepsilon$-optimal strategies for player 2 symmetrically. Note
that the quantitative determinacy of concurrent games is equivalent to the existence of
$\varepsilon$-optimal strategies for both players for all $\varepsilon > 0$ at all states
$s \in S$. For the special case of deterministic turn-based games, it is known that the
value of any $\omega$-regular game at a state is either 0 or 1, and finite-memory
deterministic optimal strategies always exist; the value of the game can be computed with
the algorithms of [BL69, GH82, EJ91].


2.3.  Predecessor  operators 

Let $\mathcal{F}$ be the space of all functions $S \mapsto [0,1]$ that map states into the
interval $[0,1]$. Given two functions $f, g \in \mathcal{F}$, we write $f > g$ (resp.
$f \ge g$) if $f(s) > g(s)$ (resp. $f(s) \ge g(s)$) at all $s \in S$, and we define
$f \wedge g$ and $f \vee g$ by

$$(f \wedge g)(s) = \min\{f(s), g(s)\}$$

$$(f \vee g)(s) = \max\{f(s), g(s)\}$$

for all $s \in S$. For $f, g \in \mathcal{F}$, we use the notation
$|f - g| = \max_{s \in S} |f(s) - g(s)|$. We denote by 0 and 1 the constant functions that
map all states into 0 and 1, respectively. For all $f \in \mathcal{F}$, we denote by
$1 - f$ the function defined by $(1 - f)(s) = 1 - f(s)$ for all $s \in S$. Given a subset
$Q \subseteq S$ of states, by abuse of notation we denote also by $Q$ the indicator
function of $Q$, defined by $Q(s) = 1$ if $s \in Q$ and $Q(s) = 0$ otherwise. We denote by
$\neg Q = S \setminus Q$ the complement of the subset $Q$ in $S$, and again we denote
equally by $\neg Q$ the indicator function of $\neg Q$. We denote by
$\mathcal{F}_I \subseteq \mathcal{F}$ the set of indicator functions. The quantitative
predecessor operators
$\mathrm{Ppre}_1, \mathrm{Ppre}_2 : \mathcal{F} \mapsto \mathcal{F}$ are defined for every
$f \in \mathcal{F}$ by

$$\mathrm{Ppre}_1(f)(s) = \sup_{\pi_1 \in \Pi_1} \inf_{\pi_2 \in \Pi_2} E_s^{\pi_1,\pi_2}\{f(\Theta_1)\}$$

and symmetrically for $\mathrm{Ppre}_2$. Intuitively, the value $\mathrm{Ppre}_i(f)(s)$ is
the maximum expectation for the next value of $f$ that player $i \in \{1,2\}$ can achieve.
Given $f \in \mathcal{F}$ and $i \in \{1,2\}$, the function $\mathrm{Ppre}_i(f)$ can be
computed by solving the following matrix game at each $s \in S$:

$$\mathrm{Ppre}_i(f)(s) = \mathrm{val}_i \Bigl[ \sum_{t \in S} f(t)\, p(t \mid s, a_1, a_2) \Bigr]_{a_1 \in \Gamma_1(s),\, a_2 \in \Gamma_2(s)},$$

where $\mathrm{val}_i A$ denotes the value obtained by player $i$ in the matrix game $A$.
The existence of solutions to the above matrix games, and the existence of optimal
randomized strategies for players 1 and 2, is guaranteed by the minimax theorem [vNM47].
The matrix games may be solved using traditional linear programming algorithms (see, e.g.,
[Owe95]). From properties of matrix games we have the following facts.

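Proposition 1.

1. For $i \in \{1,2\}$, the operator $\mathrm{Ppre}_i$ is monotonic and continuous, that
is, for all $f, g \in \mathcal{F}$, if $f \ge g$ then
$\mathrm{Ppre}_i(f) \ge \mathrm{Ppre}_i(g)$; and for all $f_1 \le f_2 \le \cdots$ in
$\mathcal{F}$, we have
$\lim_{n \to \infty} \mathrm{Ppre}_i(f_n) = \mathrm{Ppre}_i(\lim_{n \to \infty} f_n)$.

2. For all $f, g \in \mathcal{F}$ and all $i \in \{1,2\}$, we have
$|\mathrm{Ppre}_i(f) - \mathrm{Ppre}_i(g)| \le |f - g|$.

3. The operators $\mathrm{Ppre}_1$ and $\mathrm{Ppre}_2$ are dual: for all
$f \in \mathcal{F}$, we have $\mathrm{Ppre}_1(f) = 1 - \mathrm{Ppre}_2(1 - f)$.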
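The matrix-game characterization of the predecessor operators, and the properties listed in Proposition 1, can be exercised on a small instance. The Python sketch below is illustrative only. It assumes a hypothetical three-state game in which each player has at most two moves per state, so that matrix values can be obtained from a pure saddle point or from the classical closed form for 2x2 games; larger games would need a linear-programming solver, which is not shown. The names `val`, `expect`, `ppre1`, and `ppre2` are choices made for this example.

```python
from fractions import Fraction


def val(m):
    """Value, for the maximizing row player, of the zero-sum matrix game m."""
    lower = max(min(row) for row in m)          # best pure guarantee of the row player
    upper = min(max(col) for col in zip(*m))    # best pure guarantee of the column player
    if lower == upper:                          # pure saddle point
        return lower
    if len(m) == 2 and len(m[0]) == 2:          # mixed 2x2 case, closed form
        (a, b), (c, d) = m
        return (a * d - b * c) / (a + d - b - c)
    raise NotImplementedError("larger games need a linear-programming solver")


# Hypothetical game: s and u are absorbing; at t both players have two moves.
P = {
    ("t", "a1", "a2"): {"u": Fraction(1, 2), "t": Fraction(1, 2)},
    ("t", "a1", "b2"): {"s": Fraction(1)},
    ("t", "b1", "a2"): {"s": Fraction(1)},
    ("t", "b1", "b2"): {"u": Fraction(3, 4), "t": Fraction(1, 4)},
    ("s", "-", "-"): {"s": Fraction(1)},
    ("u", "-", "-"): {"u": Fraction(1)},
}
G1 = {"t": ["a1", "b1"], "s": ["-"], "u": ["-"]}
G2 = {"t": ["a2", "b2"], "s": ["-"], "u": ["-"]}


def expect(f, s, a1, a2):
    """Expected value of f after one step from s under moves a1 and a2."""
    return sum(q * f[t] for t, q in P[(s, a1, a2)].items())


def ppre1(f):
    """Ppre_1(f): player 1 picks the rows and maximizes."""
    return {s: val([[expect(f, s, a1, a2) for a2 in G2[s]] for a1 in G1[s]]) for s in G1}


def ppre2(f):
    """Ppre_2(f): player 2 is the maximizer, so her moves index the rows."""
    return {s: val([[expect(f, s, a1, a2) for a1 in G1[s]] for a2 in G2[s]]) for s in G1}


f = {"s": Fraction(0), "t": Fraction(1, 3), "u": Fraction(1)}
g = {"s": Fraction(0), "t": Fraction(1, 2), "u": Fraction(1)}
one_minus_f = {s: 1 - v for s, v in f.items()}

duality_holds = all(ppre1(f)[s] == 1 - ppre2(one_minus_f)[s] for s in G1)
distance_after = max(abs(ppre1(f)[s] - ppre1(g)[s]) for s in G1)
distance_before = max(abs(f[s] - g[s]) for s in G1)
```

On this instance the duality of part 3 holds exactly, and `distance_after` does not exceed `distance_before`, as part 2 predicts. Exact rational arithmetic is used so that the equalities can be tested without rounding concerns.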


2.4. Quantitative game $\mu$-calculus

We write the solutions of games with respect to $\omega$-regular winning conditions in
quantitative game $\mu$-calculus. The formulas of the quantitative game $\mu$-calculus are
generated by the grammar

$$\phi ::= Q \mid x \mid \phi \vee \phi \mid \phi \wedge \phi \mid \mathrm{Ppre}_1(\phi) \mid \mathrm{Ppre}_2(\phi) \mid \mu x.\phi \mid \nu x.\phi, \tag{1}$$

for propositions $Q \subseteq S$ and variables $x$ from some fixed set $X$. Hence, as for
LTL, the propositions of quantitative $\mu$-calculus formulas correspond to subsets of
states of the game. As usual, a formula $\phi$ is closed if every variable $x$ in $\phi$
occurs in the scope of a fixpoint quantifier $\mu x$ or $\nu x$.

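Let $\mathcal{E} : X \mapsto \mathcal{F}$ be a variable valuation that associates a
function $\mathcal{E}(x) \in \mathcal{F}$ with each variable $x \in X$. We write
$\mathcal{E}[x \mapsto f]$ for the valuation that agrees with $\mathcal{E}$ on all
variables, except that $x \in X$ is mapped to $f \in \mathcal{F}$. Given a valuation
$\mathcal{E}$, every formula $\phi$ of quantitative game $\mu$-calculus defines a function
$[\![\phi]\!]_{\mathcal{E}} \in \mathcal{F}$:

$$
\begin{aligned}
[\![Q]\!]_{\mathcal{E}} &= Q \\
[\![x]\!]_{\mathcal{E}} &= \mathcal{E}(x) \\
[\![\mathrm{Ppre}_1(\phi)]\!]_{\mathcal{E}} &= \mathrm{Ppre}_1([\![\phi]\!]_{\mathcal{E}}) \\
[\![\mathrm{Ppre}_2(\phi)]\!]_{\mathcal{E}} &= \mathrm{Ppre}_2([\![\phi]\!]_{\mathcal{E}}) \\
[\![\phi_1 \,\{\vee, \wedge\}\, \phi_2]\!]_{\mathcal{E}} &= [\![\phi_1]\!]_{\mathcal{E}} \,\{\vee, \wedge\}\, [\![\phi_2]\!]_{\mathcal{E}} \\
[\![\mu x.\phi]\!]_{\mathcal{E}} &= \inf\{f \in \mathcal{F} \mid f = [\![\phi]\!]_{\mathcal{E}[x \mapsto f]}\} \\
[\![\nu x.\phi]\!]_{\mathcal{E}} &= \sup\{f \in \mathcal{F} \mid f = [\![\phi]\!]_{\mathcal{E}[x \mapsto f]}\}
\end{aligned}
$$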

The existence and uniqueness of the above fixpoints for the $\mu$ and $\nu$ operators is a
consequence of the monotonicity and continuity of all the operators, and in particular of
$\mathrm{Ppre}_1$ and $\mathrm{Ppre}_2$. As usual, the fixpoints can be evaluated in an
iterative fashion: we have $[\![\mu x.\phi]\!]_{\mathcal{E}} = \lim_{n \to \infty} x_n$,
where $x_0 = 0$, and $x_{n+1} = [\![\phi]\!]_{\mathcal{E}[x \mapsto x_n]}$ for $n \ge 0$.
Similarly, for the greatest fixpoint operator $\nu$ we have
$[\![\nu x.\phi]\!]_{\mathcal{E}} = \lim_{n \to \infty} x_n$, where $x_0 = 1$, and
$x_{n+1} = [\![\phi]\!]_{\mathcal{E}[x \mapsto x_n]}$ for $n \ge 0$. Moreover, for a closed
$\mu$-calculus formula $\phi$, the function $[\![\phi]\!]_{\mathcal{E}}$ is independent of
the valuation $\mathcal{E}$, and hence we write $[\![\phi]\!]$ to denote
$[\![\phi]\!]_{\mathcal{E}}$ for some $\mathcal{E}$. We note that the solution algorithms
presented in this paper apply also to games with countable (rather than finite) state space
and finite set of moves (see Theorem 4); in this case, however, the iterative evaluation of
the fixpoints needs to be based on transfinite induction.

The quantitative game $\mu$-calculus defined by (1) suffices for writing the solution
formulas of games with $\omega$-regular winning conditions. In intermediate lemmas and
proofs, however, we use with slight abuse of notation an extended version of the calculus,
in which we have one symbol $f$ for every function $f \in \mathcal{F}$. Obviously, such
functions are interpreted as themselves: for all valuations $\mathcal{E}$, we have
$[\![f]\!]_{\mathcal{E}} = f$.

2.5.  Complementation  and  correctness 

We solve concurrent games with LTL winning condition $\Psi$ by providing a quantitative
game $\mu$-calculus formula $\phi$ such that $\langle 1 \rangle \Psi = [\![\phi]\!]$. To
prove this equality, we exploit the complementation of $\mu$-calculus expressions. The
complement of a closed $\mu$-calculus formula $\phi$ is a formula $\neg\phi$ such that
$1 - [\![\phi]\!] = [\![\neg\phi]\!]$; the complement can be obtained by recursively
applying the following transformations, which rely on the duality of $\mathrm{Ppre}_1$ and
$\mathrm{Ppre}_2$:

$$
\begin{aligned}
\neg Q &\;\Rightarrow\; S \setminus Q \\
\neg\neg\phi &\;\Rightarrow\; \phi \\
\neg(\mathrm{Ppre}_1(\phi)) &\;\Rightarrow\; \mathrm{Ppre}_2(\neg\phi) \\
\neg(\mathrm{Ppre}_2(\phi)) &\;\Rightarrow\; \mathrm{Ppre}_1(\neg\phi) \\
\neg(\phi_1 \vee \phi_2) &\;\Rightarrow\; (\neg\phi_1) \wedge (\neg\phi_2) \\
\neg(\phi_1 \wedge \phi_2) &\;\Rightarrow\; (\neg\phi_1) \vee (\neg\phi_2) \\
\neg\mu x.\phi &\;\Rightarrow\; \nu x.\neg\phi[\neg x/x] \\
\neg\nu x.\phi &\;\Rightarrow\; \mu x.\neg\phi[\neg x/x]
\end{aligned}
$$

where $\phi[\neg x/x]$ denotes the result of replacing every free occurrence of $x$ in
$\phi$ with $\neg x$. Note that given a closed formula $\phi$ defined by grammar (1), by
applying the above transformations to $\neg\phi$ we obtain again a closed formula defined
by grammar (1). In fact, the above transformations push the $\neg$ operator to the leaves
of the syntax tree generated by grammar (1), which consist either of subsets
$Q \subseteq S$ or of variables $x \in X$. The subsets are simply complemented. Since
$\phi$ is closed, each variable $x \in X$ in $\phi$ appears in the scope of a $\mu x$ or
$\nu x$ quantifier; the transformation rules for $\mu$ and $\nu$, together with the rule
for double negation elimination, ensure that once all transformations have been applied, no
$\neg$ operator remains as prefix to a variable.
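The complementation rules are purely syntactic and can be applied mechanically. The following Python sketch shows one way to do so; the tuple encoding of formulas (tags `"set"`, `"var"`, `"not"`, `"ppre"`, `"or"`, `"and"`, `"mu"`, `"nu"`) and the function names `negate_var` and `complement` are assumptions of this example rather than notation from the paper.

```python
def negate_var(phi, x):
    """Replace every free occurrence of variable x in phi by ('not', ('var', x))."""
    kind = phi[0]
    if kind == "var":
        return ("not", phi) if phi[1] == x else phi
    if kind == "set":
        return phi
    if kind == "not":
        return ("not", negate_var(phi[1], x))
    if kind == "ppre":
        return ("ppre", phi[1], negate_var(phi[2], x))
    if kind in ("or", "and"):
        return (kind, negate_var(phi[1], x), negate_var(phi[2], x))
    # kind is "mu" or "nu": an inner binder with the same name shadows x
    return phi if phi[1] == x else (kind, phi[1], negate_var(phi[2], x))


def complement(phi, states):
    """Push a negation applied to phi down to the leaves, following the rewrite rules."""
    kind = phi[0]
    if kind == "set":
        return ("set", states - phi[1])
    if kind == "var":
        return ("not", phi)            # free variable: the negation stays at the leaf
    if kind == "not":
        return phi[1]                  # double negation elimination
    if kind == "ppre":
        return ("ppre", 3 - phi[1], complement(phi[2], states))   # swap players 1 and 2
    if kind in ("or", "and"):
        dual = "and" if kind == "or" else "or"
        return (dual, complement(phi[1], states), complement(phi[2], states))
    dual = "nu" if kind == "mu" else "mu"
    return (dual, phi[1], complement(negate_var(phi[2], phi[1]), states))


# Example: the complement of  mu x.(U or Ppre_1(x))  over states {s, t, u} with U = {u}.
S = frozenset({"s", "t", "u"})
U = frozenset({"u"})
reach = ("mu", "x", ("or", ("set", U), ("ppre", 1, ("var", "x"))))
co_reach = complement(reach, S)
```

For this input, `co_reach` encodes $\nu x.(\neg U \wedge \mathrm{Ppre}_2(x))$, and no negation remains in front of a variable; applying `complement` a second time returns the original formula.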

Our proofs of $\langle 1 \rangle \Psi = [\![\phi]\!]$ consist of two steps.

• First, from $\phi$ we construct for all $\varepsilon > 0$ a strategy
$\pi_1^\varepsilon$ for player 1 that ensures winning with probability at least
$[\![\phi]\!] - \varepsilon$, proving $[\![\phi]\!] \le \langle 1 \rangle \Psi$.

• Second, we complement $\phi$, and we consider the winning condition $\neg\Psi$. From
$\neg\phi$ we construct for all $\varepsilon > 0$ a strategy $\pi_2^\varepsilon$ that
enables player 2 to win the game with goal $\neg\Psi$ with probability at least
$[\![\neg\phi]\!] - \varepsilon$; this shows
$[\![\neg\phi]\!] \le \langle 2 \rangle \neg\Psi$, or equivalently
$[\![\phi]\!] \ge \langle 1 \rangle \Psi$.

Even in the cases where solution formulas for concurrent games are known, such as for the
reachability winning condition (see, e.g., [FV97], Chapter 4.4), this approach yields
simpler arguments than the classical one, where the $\varepsilon$-optimal strategies for
both players have to be constructed from the solution formula $\phi$ for player 1 alone,
and where it is usually necessary to consider discounted versions of the games.


3.  REACHABILITY  AND  SAFETY  GAMES 


Concurrent reachability and safety games can be solved by reducing them to positive
stochastic games [TV87, FV97]. We present the solution algorithms, reformulating them in
quantitative game $\mu$-calculus. As mentioned in the introduction, by relying on the
complementation of quantitative game $\mu$-calculus, we are able to prove the correctness
of the solutions without resorting to the consideration of discounted versions of the same
games.

A concurrent reachability game consists of a concurrent game structure
$\mathcal{G} = \langle S, \mathit{Moves}, \Gamma_1, \Gamma_2, p \rangle$ together with a
winning condition $\Diamond U$, where $U \subseteq S$. Intuitively, the winning condition
consists in reaching the subset $U$ of states. The solution of such a reachability game is
given by

$$\langle 1 \rangle \Diamond U = [\![\mu x.(U \vee \mathrm{Ppre}_1(x))]\!]. \tag{2}$$


This solution can be computed iteratively as the limit
$\langle 1 \rangle \Diamond U = \lim_{k \to \infty} x_k$, where $x_0 = 0$ and
$x_{k+1} = U \vee \mathrm{Ppre}_1(x_k)$ for $k \ge 0$. This iteration yields an
approximation scheme for solving the reachability game. In Markov decision processes, one
can reduce the reachability question to a linear programming problem, which can then be
solved exactly; this gives an alternative to value iteration. Unfortunately, for concurrent
games we cannot reduce the problem to linear programming, because the maximal probability
of winning in a game where all probabilities are rationals may still be irrational (see,
e.g., [RF91]).

Example 1. Consider a concurrent game with three states $s$, $t$, and $u$, and winning
condition $\Diamond\{u\}$. The transition relation is as follows: from state $t$, player 1
has two choices $a_1$ and $b_1$, and player 2 the choices $a_2$ and $b_2$. The transition
probabilities are: $\Pr(u \mid t, a_1, a_2) = \Pr(t \mid t, a_1, a_2) = \frac{1}{2}$,
$\Pr(u \mid t, b_1, a_2) = \Pr(u \mid t, a_1, b_2) = 0$,
$\Pr(s \mid t, b_1, a_2) = \Pr(s \mid t, a_1, b_2) = 1$,
$\Pr(u \mid t, b_1, b_2) = \frac{3}{4}$, and $\Pr(t \mid t, b_1, b_2) = \frac{1}{4}$. The
states $s$ and $u$ are absorbing: the game never leaves $s$ or $u$ once it reaches these
states. The maximal probability of winning the game $\Diamond\{u\}$ is given by the least
fixpoint of $x = \mathrm{Ppre}_1(x) \vee \{u\}$; for state $t$, we have

$$x(t) = \mathrm{val}_1 \begin{bmatrix} \frac{1}{2} + \frac{1}{2} x(t) & 0 \\ 0 & \frac{3}{4} + \frac{1}{4} x(t) \end{bmatrix}$$

which has the solution $x(t) = (-3 + 2\sqrt{6})/5$.
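The irrational value in Example 1 can be approached by the iteration $x_0 = 0$, $x_{k+1} = U \vee \mathrm{Ppre}_1(x_k)$. The Python sketch below does this in floating point. It assumes the transition probabilities $\Pr(u \mid t, a_1, a_2) = \Pr(t \mid t, a_1, a_2) = 1/2$, $\Pr(u \mid t, b_1, b_2) = 3/4$ and $\Pr(t \mid t, b_1, b_2) = 1/4$, which are the values consistent with the stated solution, and it solves the matrix games with a saddle-point test plus the closed form for 2x2 games; all function names are illustrative.

```python
import math


def val(m):
    """Value, for the maximizing row player, of a zero-sum game with at most 2x2 entries."""
    lower = max(min(row) for row in m)
    upper = min(max(col) for col in zip(*m))
    if lower == upper:                       # pure saddle point
        return lower
    (a, b), (c, d) = m                       # otherwise both players mix
    return (a * d - b * c) / (a + d - b - c)


# (state, move of player 1, move of player 2) -> successor distribution
P = {
    ("t", "a1", "a2"): {"u": 0.5, "t": 0.5},
    ("t", "a1", "b2"): {"s": 1.0},
    ("t", "b1", "a2"): {"s": 1.0},
    ("t", "b1", "b2"): {"u": 0.75, "t": 0.25},
    ("s", "-", "-"): {"s": 1.0},
    ("u", "-", "-"): {"u": 1.0},
}
G1 = {"t": ["a1", "b1"], "s": ["-"], "u": ["-"]}
G2 = {"t": ["a2", "b2"], "s": ["-"], "u": ["-"]}
U = {"u"}


def ppre1(x):
    """One application of Ppre_1 to the value function x."""
    return {s: val([[sum(q * x[t] for t, q in P[(s, a1, a2)].items())
                     for a2 in G2[s]] for a1 in G1[s]])
            for s in G1}


def reach_value(iterations=200):
    """Iterate x_{k+1} = U v Ppre_1(x_k) from x_0 = 0."""
    x = {s: 0.0 for s in G1}
    for _ in range(iterations):
        pre = ppre1(x)
        x = {s: max(1.0 if s in U else 0.0, pre[s]) for s in G1}
    return x


approx = reach_value()
exact_t = (-3 + 2 * math.sqrt(6)) / 5
```

After 200 iterations, `approx["t"]` agrees with `exact_t` to within floating-point accuracy, while `approx["s"]` stays at 0 and `approx["u"]` at 1. The iteration only ever approximates the value from below, which reflects the remark that no exact reduction to linear programming is available.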


To prove (2), we show separately the two inequalities

$$\langle 1 \rangle \Diamond U \ge [\![ \mu x. (U \vee \mathrm{Ppre}_1(x)) ]\!],$$
$$\langle 1 \rangle \Diamond U \le [\![ \mu x. (U \vee \mathrm{Ppre}_1(x)) ]\!].$$

The first inequality is a consequence of the following lemma; the second inequality, as mentioned in Section 2.5, will follow from results on safety games.

Lemma 1. Let $w = [\![ \mu x. (U \vee \mathrm{Ppre}_1(x)) ]\!]$. For all $\varepsilon > 0$ player 1 has a strategy $\pi_1^\varepsilon$ such that $\Pr_s^{\pi_1^\varepsilon, \pi_2}(\Diamond U) \ge w(s) - \varepsilon$ for all $\pi_2 \in \Pi_2$ and all $s \in S$.

Proof. The proof follows a classical argument (see, e.g., [Eve57, FV97]). For $n \ge 0$, consider the $n$-step version of the game, whose winning condition $\Diamond_n U$ requires reaching $U$ in at most $n$ steps. We construct inductively a sequence $\{\pi_1^n\}_{n \ge 0}$ of strategies for player 1. Let $x_0 = 0$ and $x_{k+1} = U \vee \mathrm{Ppre}_1(x_k)$ for $k \ge 0$. Strategy $\pi_1^0$ is chosen arbitrarily. For $n \ge 0$ and $s \in S$, the distribution $\pi_1^{n+1}(s)$ corresponds to an optimal distribution over $\Gamma_1(s)$ in the matrix game for $\mathrm{Ppre}_1(x_n)$ at $s$. For $n \ge 0$, $s \in S$, and $\sigma \in S^+$, we let $\pi_1^{n+1}(s\sigma) = \pi_1^n(\sigma)$. We show by induction on $n$ that for all strategies $\pi_2$ for player 2, and for all $s \in S$, we have $\Pr_s^{\pi_1^n, \pi_2}(\Diamond_n U) \ge x_n(s)$. For $n = 0$, the result is immediate; the result is also immediate for $s \in U$. For $n \ge 0$ and $s \notin U$, we have

$$\begin{aligned}
\Pr\nolimits_s^{\pi_1^{n+1}, \pi_2}(\Diamond_{n+1} U) &\ge \sum_{t \in S} \Pr\nolimits_t^{\pi_1^{n}, \pi_2[t]}(\Diamond_n U) \, \Pr\nolimits_s^{\pi_1^{n+1}, \pi_2}(\Theta_1 = t) \\
&\ge \sum_{t \in S} x_n(t) \, \Pr\nolimits_s^{\pi_1^{n+1}, \pi_2}(\Theta_1 = t) \\
&\ge \mathrm{Ppre}_1(x_n)(s) = x_{n+1}(s),
\end{aligned}$$

where $\pi_2[t]$ is the strategy that behaves like $\pi_2$ after a transition to $t$ has occurred. The lemma then follows from $w = \lim_{n \to \infty} x_n$, and from the fact that $\Diamond_n U$ implies $\Diamond U$ for all $n \ge 0$. In fact, given any $\varepsilon > 0$, there is $n \ge 0$ such that $\max\{w(s) - x_n(s) \mid s \in S\} < \varepsilon$. When player 1 uses strategy $\pi_1^n$ we have, for all strategies $\pi_2$ of player 2 and all $s \in S$, $\Pr_s^{\pi_1^n, \pi_2}(\Diamond U) \ge \Pr_s^{\pi_1^n, \pi_2}(\Diamond_n U) \ge x_n(s) \ge w(s) - \varepsilon$. ■

A concurrent safety game consists of a concurrent game structure $\mathcal{G} = (S, \mathit{Moves}, \Gamma_1, \Gamma_2, p)$ together with a winning condition $\Box U$, where $U \subseteq S$. Intuitively, the winning condition consists in staying forever in the subset $U$ of states. The complement of the reachability condition $\Diamond U$ is the safety condition $\Box \neg U$, and the complement of the quantitative game $\mu$-calculus formula $\mu x. (U \vee \mathrm{Ppre}_1(x))$ is

$$\nu x. (\neg U \wedge \mathrm{Ppre}_2(x)),$$

where $\neg U$ is an abbreviation for $S \setminus U$. We will show that the solution of concurrent safety games is given by

$$\langle 1 \rangle \Box U = [\![ \nu x. (U \wedge \mathrm{Ppre}_1(x)) ]\!], \qquad (3)$$

which is dual to (2). To this end, we prove the following lemma.

Lemma 2. Let $w = [\![ \nu x. (U \wedge \mathrm{Ppre}_1(x)) ]\!]$. Player 1 has a strategy $\pi_1$ such that $\Pr_s^{\pi_1, \pi_2}(\Box U) \ge w(s)$ for all $\pi_2 \in \Pi_2$ and all $s \in S$.

The lemma can be proved using standard arguments about positive reward games [FV97]. We present here a more direct proof, which will lead to the arguments for Büchi and co-Büchi games.

Proof. Let $\pi_1$ be a memoryless strategy for player 1 that at all $s \in U$ plays according to an optimal distribution of the matrix game corresponding to $\mathrm{Ppre}_1(w)(s)$, and at all $s \in S \setminus U$ plays arbitrarily. Fix a state $s_0 \in S$ and an arbitrary strategy $\pi_2 \in \Pi_2$. The process $\{H_n\}_{n \ge 0}$ defined by $H_n = w(\Theta_n)$ is a submartingale [Wil91]: in fact, from $w(s) = \mathrm{Ppre}_1(w)(s)$ for $s \in U$ and from the optimality of $\pi_1$ it follows that

$$\mathrm{E}_{s_0}^{\pi_1, \pi_2}\{H_{n+1} \mid H_0, H_1, \ldots, H_n\} \ge H_n$$

for all $n \ge 0$. Hence, we have $\mathrm{E}_{s_0}^{\pi_1, \pi_2}\{H_n\} \ge H_0 = w(s_0)$. Moreover, since $w(s) \le 1$ at all $s \in S$ and $w(s) = 0$ at $s \in S \setminus U$, by inspection we have $\mathrm{E}_{s_0}^{\pi_1, \pi_2}\{H_n\} \le \Pr_{s_0}^{\pi_1, \pi_2}(\Box_n U)$, where $\Box_n U$ is the event of staying in $U$ for at least $n$ steps. Combining these two inequalities we obtain $w(s_0) \le \Pr_{s_0}^{\pi_1, \pi_2}(\Box_n U)$, and the result follows from $\Pr_{s_0}^{\pi_1, \pi_2}(\Box U) = \lim_{n \to \infty} \Pr_{s_0}^{\pi_1, \pi_2}(\Box_n U)$. ■

The  following  theorem  summarizes  the  properties  of  concurrent  reachability  and 
safety  games. 

Theorem  1.  The  following  assertions  hold. 

1.  Concurrent  reachability  and  safety  games  can  be  solved  according  to  (2)  and 
(3). 

2. Concurrent reachability games have memoryless ε-optimal strategies; there are deterministic concurrent reachability games without optimal strategies.

3. Concurrent safety games have memoryless optimal strategies; there are deterministic concurrent safety games without memoryless deterministic optimal strategies.

Part 1 is classical [Eve57, FV97], except for the notation; the result also follows from the combination of Lemmas 1 and 2. The existence of memoryless ε-optimal strategies for concurrent reachability games follows from results on positive stochastic games (see, e.g., [FV97], p. 196). The proof of Lemma 1 constructs an ε-optimal strategy for player 1, but the strategy is in general not memoryless. The existence of deterministic concurrent reachability games without optimal strategies is demonstrated by Example 2 below, adapted from [Eve57, KS81]. The existence of memoryless optimal strategies for concurrent safety games is classical; it also follows from the proof of Lemma 2. The existence of deterministic concurrent safety games without optimal deterministic strategies is demonstrated by the game MatchOneBit described in the introduction: in fact, randomized strategies are necessary for one-step matrix games [Owe95].

Example 2. Consider the following game, adapted from [Eve57, KS81] (see also [dAHK98] for an intuitive interpretation of the game). The state space of the game is $S = \{s, t, u\}$; the only state where players can choose among more than one move is $s$. We have $\Gamma_1(s) = \{a, b\}$ and $\Gamma_2(s) = \{c, d\}$. The game has a deterministic transition function: $p(s \mid s, a, c) = p(t \mid s, a, d) = p(t \mid s, b, c) = p(u \mid s, b, d) = 1$; all other transition probabilities are 0. We have $\langle 1 \rangle \Diamond\{t\}(s) = 1$. In fact, player 1 can play moves $a$ and $b$ with probability $1 - \varepsilon$ and $\varepsilon$ respectively to ensure a winning probability of $1 - \varepsilon$ from $s$, for $\varepsilon > 0$. However, player 1 has no optimal strategy: if he never plays move $b$, player 2 can always play move $c$, and the game stays in $s$ forever; if he decides to play move $b$ with positive probability at the $n$-th round, player 2 can play move $d$ at the $n$-th round, so that the probability of reaching $t$ is always less than 1. ■
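The iteration $x_{k+1} = U \vee \mathrm{Ppre}_1(x_k)$ can be carried out exactly on the game of Example 2. The Python sketch below is illustrative only: `value_2x2` is a hypothetical helper computing the value of a $2 \times 2$ zero-sum matrix game (a saddle point if one exists, otherwise the classical mixed-strategy formula), and the matrix at $s$ is taken to be $[[x(s), 1], [1, 0]]$, with rows $a, b$ and columns $c, d$, since $t$ has value 1 and $u$ has value 0.

```python
from fractions import Fraction


def value_2x2(m):
    """Value of the 2x2 zero-sum matrix game m for the maximizing row player."""
    (a, b), (c, d) = m
    maximin = max(min(a, b), min(c, d))
    minimax = min(max(a, c), max(b, d))
    if maximin == minimax:
        # A pure saddle point exists.
        return maximin
    # No saddle point: both players randomize.
    return (a * d - b * c) / (a + d - b - c)


def reachability_iterates(steps):
    """Iterates x_k(s) of x_{k+1}(s) = val [[x_k(s), 1], [1, 0]], with x_0(s) = 0."""
    x = Fraction(0)
    values = [x]
    for _ in range(steps):
        x = value_2x2(((x, Fraction(1)), (Fraction(1), Fraction(0))))
        values.append(x)
    return values


print(reachability_iterates(5))
```

On this game the iterates are $x_k(s) = k/(k+1)$: they converge to 1 without ever reaching it, so the number of iterations needed for a tolerance $\varepsilon$ grows like $1/\varepsilon$.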
4. BÜCHI AND CO-BÜCHI GAMES


A concurrent Büchi game consists of a concurrent game structure $\mathcal{G} = (S, \mathit{Moves}, \Gamma_1, \Gamma_2, p)$ together with a winning condition $\Box\Diamond U$, where $U \subseteq S$. Intuitively, the winning condition consists in visiting the subset $U$ of states infinitely often. The solution of a concurrent Büchi game is given by

$$\langle 1 \rangle \Box\Diamond U = [\![ \nu y. \mu x. ((\neg U \wedge \mathrm{Ppre}_1(x)) \vee (U \wedge \mathrm{Ppre}_1(y))) ]\!]. \qquad (4)$$
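Formula (4) can be evaluated by nested iteration: an outer decreasing iteration for $\nu y$ and, for each $y$, an inner increasing iteration for $\mu x$. The Python sketch below is an illustration under strong assumptions: it uses a hypothetical three-state structure in which each player has a single move at every state, so that $\mathrm{Ppre}_1(f)(s)$ reduces to the expected value of $f$ after one step, and it takes $U = \{0\}$.

```python
from fractions import Fraction

# Hypothetical example: P[s][t] is the probability of moving from s to t.
HALF = Fraction(1, 2)
P = [
    [0, 1, 0],        # state 0 (in U) moves to state 1
    [HALF, 0, HALF],  # state 1 moves to state 0 or to the sink 2
    [0, 0, 1],        # state 2 is an absorbing sink outside U
]
U = {0}
STATES = range(len(P))


def ppre(f):
    """One-step expectation; equals Ppre_1(f) when both players have one move."""
    return [sum(P[s][t] * f[t] for t in STATES) for s in STATES]


def inner_lfp(y):
    """Least fixpoint of x = (not U and Ppre(x)) or (U and Ppre(y)), from x = 0.
    The loop stops on this example; in general convergence holds only in the limit."""
    py = ppre(y)
    x = [Fraction(0) for _ in STATES]
    while True:
        px = ppre(x)
        nxt = [py[s] if s in U else px[s] for s in STATES]
        if nxt == x:
            return x
        x = nxt


def buchi_iterates(rounds):
    """Outer iterates y_0 = 1, y_{k+1} = inner_lfp(y_k) of the greatest fixpoint."""
    y = [Fraction(1) for _ in STATES]
    iterates = [y]
    for _ in range(rounds):
        y = inner_lfp(y)
        iterates.append(y)
    return iterates


print(buchi_iterates(3))
```

On this example the outer iterates halve at every round and tend to 0, which is the probability of visiting $U$ infinitely often: each pass through state 1 falls into the sink with probability 1/2.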

The proof of (4) is based on two lemmas. The first lemma generalizes the result about concurrent reachability games. Given a function $g \in \mathcal{F}$ and a subset $U$ of states, we let $g(\Diamond U)$ be the random variable that associates with each path $s_0, s_1, s_2, \ldots$ the value $g(s_i)$, for $i = \min\{k \mid s_k \in U\} < \infty$, and the value 0 if $s_k \notin U$ for all $k \ge 0$. Hence, $g(\Diamond U)$ is the value of $g$ at the state where the path first enters $U$, if such a state exists, and is 0 otherwise. The following lemma can be proved similarly to Lemma 1.

Lemma 3. For $g \in \mathcal{F}$ and $U \subseteq S$, let

$$w = [\![ \mu x. ((\neg U \wedge \mathrm{Ppre}_1(x)) \vee (U \wedge g)) ]\!].$$

Then, for all $\varepsilon > 0$ player 1 has a strategy $\pi_1^\varepsilon$ that ensures $\mathrm{E}_s^{\pi_1^\varepsilon, \pi_2}\{g(\Diamond U)\} \ge w(s) - \varepsilon$ at all $s \in S$.

We call the above game a $g(\Diamond U)$-game; the strategy $\pi_1^\varepsilon$ is an ε-optimal strategy for it. The following lemma shows that the fixpoint (4) is a lower bound for the maximal probability of winning a concurrent Büchi game. The upper-bound result will follow from results on concurrent co-Büchi games.

Lemma 4. Let

$$w = [\![ \nu y. \mu x. ((\neg U \wedge \mathrm{Ppre}_1(x)) \vee (U \wedge \mathrm{Ppre}_1(y))) ]\!].$$

For all $\varepsilon > 0$ player 1 has a strategy $\pi_1^\varepsilon$ such that $\Pr_s^{\pi_1^\varepsilon, \pi_2}(\Box\Diamond U) \ge w(s) - \varepsilon$ for all $\pi_2 \in \Pi_2$ and all $s \in S$.

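Proof. From $\varepsilon$, construct a positive sequence $\{\varepsilon_i\}_{i \ge 0}$ with $\sum_{i=0}^{\infty} \varepsilon_i < \varepsilon$. The strategy $\pi_1^\varepsilon$ is as follows. In $S \setminus U$ the strategy $\pi_1^\varepsilon$ initially coincides with an $\varepsilon_0$-optimal strategy for the game $w(\Diamond U)$. Upon reaching $U$, the strategy $\pi_1^\varepsilon$ plays according to an optimal distribution of the matrix game corresponding to $\mathrm{Ppre}_1(w)$, until $U$ is left. In the following $\neg U$-phase, $\pi_1^\varepsilon$ coincides with an $\varepsilon_1$-optimal strategy for the game $w(\Diamond U)$; and so forth. Fix a state $s_0 \in S$ and a strategy $\pi_2 \in \Pi_2$. Define the process $\{H_n\}_{n > 0}$, where $H_n$ is the value of $w$ at the $n$-th visit of $U$.

From Lemma 3 and from the construction of $\pi_1^\varepsilon$, we have $\mathrm{E}_{s_0}^{\pi_1^\varepsilon, \pi_2}\{H_1\} \ge w(s_0) - \varepsilon_0$, and for $n > 0$,

$$\mathrm{E}_{s_0}^{\pi_1^\varepsilon, \pi_2}\{H_{n+1} \mid H_1, H_2, \ldots, H_n\} \ge H_n - \varepsilon_n.$$

By taking expectations on both sides, and by induction, this leads to

$$\mathrm{E}_{s_0}^{\pi_1^\varepsilon, \pi_2}\{H_n\} \ge w(s_0) - \sum_{k=0}^{n-1} \varepsilon_k$$

for all $n > 0$. Denoting by $[\Box\Diamond]_{\ge n} U$ the event of visiting $U$ at least $n$ times, we have $\Pr_{s_0}^{\pi_1^\varepsilon, \pi_2}([\Box\Diamond]_{\ge n} U) \ge \mathrm{E}_{s_0}^{\pi_1^\varepsilon, \pi_2}\{H_n\}$. Combining these two results we obtain

$$\Pr\nolimits_{s_0}^{\pi_1^\varepsilon, \pi_2}([\Box\Diamond]_{\ge n} U) \ge w(s_0) - \varepsilon,$$

and the result then follows from

$$\lim_{n \to \infty} \Pr\nolimits_{s_0}^{\pi_1^\varepsilon, \pi_2}([\Box\Diamond]_{\ge n} U) = \Pr\nolimits_{s_0}^{\pi_1^\varepsilon, \pi_2}(\Box\Diamond U).$$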


A concurrent co-Büchi game consists of a concurrent game structure $\mathcal{G} = (S, \mathit{Moves}, \Gamma_1, \Gamma_2, p)$ together with a winning condition $\Diamond\Box U$, where $U \subseteq S$. Intuitively, the winning condition consists in eventually staying forever in the subset $U$ of states. The solution of a concurrent co-Büchi game is given by

$$\langle 1 \rangle \Diamond\Box U = [\![ \mu x. \nu y. ((\neg U \wedge \mathrm{Ppre}_1(x)) \vee (U \wedge \mathrm{Ppre}_1(y))) ]\!]. \qquad (5)$$

Again, the proof of the above fixpoint equation is based on two lemmas. The first lemma generalizes Lemma 2.

Lemma 5. For $g \in \mathcal{F}$ and $U \subseteq S$, let

$$w = [\![ \nu y. ((U \wedge \mathrm{Ppre}_1(y)) \vee (\neg U \wedge g)) ]\!].$$

Then the strategy $\pi_1$ of player 1 that plays at each $s \in S$ according to an optimal distribution of the matrix game corresponding to $\mathrm{Ppre}_1(w)(s)$ is such that $\Pr_s^{\pi_1, \pi_2}(\Box U) + \mathrm{E}_s^{\pi_1, \pi_2}\{g(\Diamond \neg U)\} \ge w(s)$ for all $s \in S$ and $\pi_2 \in \Pi_2$.

The proof is similar to that of Lemma 2. The following lemma shows that the fixpoint of (5) is a lower bound for the maximal probability of winning the concurrent co-Büchi game.

Lemma 6. Let

$$w = [\![ \mu x. \nu y. ((\neg U \wedge \mathrm{Ppre}_1(x)) \vee (U \wedge \mathrm{Ppre}_1(y))) ]\!].$$

For all $\varepsilon > 0$ player 1 has a strategy $\pi_1^\varepsilon$ such that $\Pr_s^{\pi_1^\varepsilon, \pi_2}(\Diamond\Box U) \ge w(s) - \varepsilon$ for all $\pi_2 \in \Pi_2$ and all $s \in S$.

Proof. Denote by $[\Diamond\Box]_{\le n} U$ the event of visiting $\neg U$ at most $n$ times. Let $x_0 = 0$, and for $n > 0$,

$$x_n = [\![ \nu y. ((\neg U \wedge \mathrm{Ppre}_1(x_{n-1})) \vee (U \wedge \mathrm{Ppre}_1(y))) ]\!].$$

By induction on $n \ge 0$, we show that player 1 has a strategy $\pi_1^n$ such that $\Pr_s^{\pi_1^n, \pi_2}([\Diamond\Box]_{\le n} U) \ge x_n(s)$ for all $s \in S$ and all $\pi_2 \in \Pi_2$. The base case is trivial. For $n > 0$, the strategy $\pi_1^n$ plays according to an optimal distribution of the matrix game corresponding to $\mathrm{Ppre}_1(x_n)$ as long as $U$ is not left. At the first visit to $\neg U$, the strategy $\pi_1^n$ plays one round according to an optimal distribution of the matrix game corresponding to $\mathrm{Ppre}_1(x_{n-1})$, and switches thereafter to the strategy $\pi_1^{n-1}$. By induction hypothesis, we have that

$$\Pr\nolimits_t^{\pi_1^{n-1}, \pi_2}([\Diamond\Box]_{\le n-1} U) \ge x_{n-1}(t) \qquad (6)$$

for all strategies $\pi_2$ of player 2 and all $t \in S$. By construction of $\pi_1^n$, together with Lemma 5, we have

$$\Pr\nolimits_s^{\pi_1^n, \pi_2}(\Box U) + \mathrm{E}_s^{\pi_1^n, \pi_2}\{\mathrm{Ppre}_1(x_{n-1})(\Diamond \neg U)\} \ge x_n(s),$$

which together with (6) yields

$$\Pr\nolimits_s^{\pi_1^n, \pi_2}(\Box U) + \mathrm{E}_s^{\pi_1^n, \pi_2}\{\mathrm{Ppre}_1(\lambda t. \Pr\nolimits_t^{\pi_1^{n-1}, \pi_2'}([\Diamond\Box]_{\le n-1} U))(\Diamond \neg U)\} \ge x_n(s), \qquad (7)$$

for all $\pi_2$, where $\lambda t. \Pr_t^{\pi_1^{n-1}, \pi_2'}([\Diamond\Box]_{\le n-1} U)$ is the usual $\lambda$-calculus notation for the function that maps each $t \in S$ to $\Pr_t^{\pi_1^{n-1}, \pi_2'}([\Diamond\Box]_{\le n-1} U)$. Since $\pi_1^{n-1}$ is the continuation of $\pi_1^n$ after the first $\neg U$-state is reached, and since we can take $\pi_2'$ to coincide with the continuation of $\pi_2$ after that state is reached, from (7) we obtain

$$\Pr\nolimits_s^{\pi_1^n, \pi_2}([\Diamond\Box]_{\le n} U) = \Pr\nolimits_s^{\pi_1^n, \pi_2}(\Box U) + \Pr\nolimits_s^{\pi_1^n, \pi_2}(U \,\mathcal{U}\, (\neg U \wedge \bigcirc [\Diamond\Box]_{\le n-1} U)) \ge x_n(s),$$

where $\bigcirc$ and $\mathcal{U}$ are the next-time and until temporal operators [MP91], completing the induction step. The lemma then follows by taking the limit $n \to \infty$, noting that $\lim_{n \to \infty} x_n = w$ and $\lim_{n \to \infty} [\Diamond\Box]_{\le n} = \Diamond\Box$. ■

The following theorem summarizes the results about concurrent Büchi and co-Büchi games.

Theorem 2. The following assertions hold.

1. Concurrent Büchi and co-Büchi games can be solved according to (4) and (5).

2. There are deterministic concurrent Büchi games without optimal strategies, and without finite-memory ε-optimal strategies.

3. There are deterministic concurrent co-Büchi games without optimal strategies.

Part 1 follows from Lemmas 4 and 6, and from quantitative game $\mu$-calculus complementation. Part 2 follows from the lack of optimal strategies for reachability (see Example 2), and from the fact that Büchi games are equivalent to iterated reachability games (see [dAH00] for an example). Part 3 is a consequence of the lack of optimal strategies for concurrent reachability games.

5.  RABIN-CHAIN  GAMES 

A concurrent Rabin-chain game consists of a concurrent game structure $\mathcal{G} = (S, \mathit{Moves}, \Gamma_1, \Gamma_2, p)$ together with a winning condition

$$\mathcal{R} = \bigvee_{i=0}^{k-1} (\Box\Diamond U_{2i} \wedge \neg \Box\Diamond U_{2i+1}),$$

where $k > 0$ and $\emptyset = U_{2k} \subseteq U_{2k-1} \subseteq U_{2k-2} \subseteq \cdots \subseteq U_0 = S$. A more intuitive characterization of this winning condition can be obtained by defining, for $0 \le i \le 2k - 1$, the set $C_i$ of states of color $i$ by $C_i = U_i \setminus U_{i+1}$. The total number of colors is $N = 2k$. Given a path $\bar{s}$, let $\mathit{Infi}(\bar{s}) \subseteq S$ be the set of states that occur infinitely often along $\bar{s}$, and let

$$\mathit{MaxCol}(\bar{s}) = \max\{ i \in \{0, \ldots, N - 1\} \mid C_i \cap \mathit{Infi}(\bar{s}) \ne \emptyset \}$$

be the largest color appearing infinitely often along the path. Then,

$$\mathcal{R} = \{ \bar{s} \in \Omega \mid \mathit{MaxCol}(\bar{s}) \text{ is even} \}.$$

The solution $\langle 1 \rangle \mathcal{R}$ for a Rabin-chain condition with $N$ colors is given by

$$\langle 1 \rangle \mathcal{R} = [\![ \eta_{N-1} x_{N-1}. \cdots \mu x_1. \nu x_0. ( \bigvee_{i=0}^{N-1} (C_i \wedge \mathrm{Ppre}_1(x_i)) ) ]\!], \qquad (8)$$

where $\eta_n = \nu$ if $n$ is even, and $\eta_n = \mu$ if $n$ is odd (compare with [EJ91]). The proof of (8) is based on the following inductive decomposition, inspired by the one of [EJ91]. We denote by $C_{\le n} = \bigcup_{i=0}^{n} C_i$ (resp. $C_{>n} = \bigcup_{i=n+1}^{N-1} C_i$ and $C_{<n} = \bigcup_{i=0}^{n-1} C_i$) the set of states colored by colors less than or equal to $n$ (resp. greater than $n$, and smaller than $n$). Let $z \in \mathcal{F}$, and for $n \ge 0$ define $J_n$ by $J_{-1}(z) = z$, and

$$J_n(z) = \eta_n x. J_{n-1}((C_n \wedge \mathrm{Ppre}_1(x)) \vee (C_{>n} \wedge z)). \qquad (9)$$

We can show by induction on $n$ that $[\![ J_n(z) ]\!]$ is the function that gives the maximal expectation of either winning the concurrent Rabin-chain game while visiting only states in $C_{\le n}$, or of the value $z(\Diamond C_{>n})$ if $C_{\le n}$ is exited. Denote by $[\mathcal{R} \wedge \Box C_{\le n}]$ the random function that has value 1 over a path exactly when the path satisfies condition $\mathcal{R}$ while visiting only states in $C_{\le n}$. The lemma below makes the above characterization of $J_n$ precise.
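The color-based reading of the Rabin-chain condition defined above (colors $C_i = U_i \setminus U_{i+1}$, and a path is winning when the largest color seen infinitely often is even) can be sketched in Python. This is only an illustration: the function names and the two-state example are hypothetical, and the check is restricted to ultimately periodic paths, for which the states visited infinitely often are exactly those on the repeated cycle.

```python
def colors_from_chain(chain):
    """Map each state to its color i, where C_i is U_i minus U_{i+1}.

    `chain` is the list [U_0, U_1, ..., U_2k] of sets, with U_0 = S
    and U_2k empty, ordered by decreasing inclusion.
    """
    color = {}
    for i in range(len(chain) - 1):
        for state in chain[i] - chain[i + 1]:
            color[state] = i
    return color


def lasso_satisfies(cycle, color):
    """Rabin-chain condition on an ultimately periodic path whose repeated
    part is `cycle`: winning iff the largest color on the cycle is even."""
    return max(color[state] for state in cycle) % 2 == 0


# Hypothetical example with k = 1 (N = 2 colors): U_0 = S, U_1 = {"b"}, U_2 empty.
chain = [{"a", "b"}, {"b"}, set()]
color = colors_from_chain(chain)
print(color, lasso_satisfies(["a"], color), lasso_satisfies(["a", "b"], color))
```

With $k = 1$ the condition reads $\Box\Diamond U_0 \wedge \neg\Box\Diamond U_1$, so a path is winning exactly when it visits `"b"` only finitely often, which is what the parity check returns.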

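Lemma 7. For all $\varepsilon > 0$, all $n \in \{0, \ldots, N - 1\}$, all $z \in \mathcal{F}$, and all states $s \in S$, there is a strategy $\pi_1 \in \Pi_1$ for player 1 such that for all strategies $\pi_2 \in \Pi_2$ of player 2, we have

$$\mathrm{E}_s^{\pi_1, \pi_2}\{[\mathcal{R} \wedge \Box C_{\le n}] + z(\Diamond C_{>n})\} \ge [\![ J_n(z) ]\!](s) - \varepsilon.$$

Proof. To prove the result, we first note that for all $-1 \le n \le N - 1$, all $z \in \mathcal{F}$, and all $s \in C_{>n}$, we have

$$[\![ J_n(z) ]\!](s) = z(s). \qquad (10)$$

This follows easily by unrolling (9) into

$$J_n(z) = \eta_n x_n. \cdots \mu x_1. \nu x_0. ((C_{>n} \wedge z) \vee (C_n \wedge \mathrm{Ppre}_1(x_n)) \vee \cdots \vee (C_0 \wedge \mathrm{Ppre}_1(x_0)))$$

and by noting that $J_n(z) \wedge C_{>n} = z \wedge C_{>n}$.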

The lemma is proved by induction on $n$, for $-1 \le n \le N - 1$. The base case for $n = -1$ follows from (10). Let $\varepsilon_0, \varepsilon_1, \varepsilon_2, \ldots > 0$ be such that $\sum_{k=0}^{\infty} \varepsilon_k < \varepsilon$. For $0 \le n \le N - 1$ there are two cases, depending on whether $n$ is odd or even.

Case for $n$ odd. If $n$ is odd, we have $\eta_n = \mu$. Let $w_0 = 0$, and for $k > 0$, let

$$w_k = [\![ J_{n-1}((C_n \wedge \mathrm{Ppre}_1(w_{k-1})) \vee (C_{>n} \wedge z)) ]\!].$$

By induction on $k$, we show that for all $k \ge 0$, player 1 has a strategy $\pi_1^k$ such that

$$\mathrm{E}_s^{\pi_1^k, \pi_2}\{[\mathcal{R} \wedge \Box C_{\le n}] + z(\Diamond C_{>n})\} \ge w_k(s) - \sum_{i=0}^{k} \varepsilon_i$$

for all $s \in S$ and $\pi_2 \in \Pi_2$. The base case, for $k = 0$, is obvious. For $k > 0$, the strategy $\pi_1^k$ for player 1 coincides with an $\varepsilon_k$-optimal strategy in the game $J_{n-1}((C_n \wedge \mathrm{Ppre}_1(w_{k-1})) \vee (C_{>n} \wedge z))$ while the game remains in $C_{<n}$; when $C_n$ is hit for the first time, it plays an optimal strategy in the matrix game $\mathrm{Ppre}_1(w_{k-1})$, and thereafter switches to the inductively constructed strategy $\pi_1^{k-1}$. Define the shorthand

$$W = [\mathcal{R} \wedge \Box C_{\le n}] + z(\Diamond C_{>n}),$$

which represents winning while never leaving $C_{\le n}$, or reaching $C_{>n}$ and getting reward $z$. For all $s \in S$ and $\pi_2 \in \Pi_2$, we have

[display not recoverable from the source: a chain of five inequalities bounding $\mathrm{E}_s^{\pi_1^k, \pi_2}\{W\}$ from below]

where $x \mathbin{\dot{-}} y = \max\{0, x - y\}$. The first inequality follows by induction on $n$, and by a case analysis on the possible ways of leaving $C_{<n}$. The strategies $\pi_1^k[r]$ and $\pi_2[r]$ behave like $\pi_1^k$ and $\pi_2$ after the path from $s$ to $r$. The second inequality follows then by an analysis of a single step from $r$. The strategy $\pi_2[r, t]$ is the strategy that behaves as $\pi_2$ after a path from $s$ to $r$ and $t$: by definition of $\pi_1^k$, we have $\pi_1^k[r, t] = \pi_1^{k-1}$. The third inequality follows by induction hypothesis on $k$, remembering the definition of $W$. The fourth inequality follows by using the definition of $\mathrm{Ppre}_1$; and the fifth inequality follows by pulling out the constant from the $\mathrm{Ppre}_1$ and the expectation. This concludes the induction on $k$; the result follows by taking $k \to \infty$.

Case for $n$ even. For even $n$, we have $\eta_n = \nu$. Let

$$w = [\![ \nu x. J_{n-1}((C_n \wedge \mathrm{Ppre}_1(x)) \vee (C_{>n} \wedge z)) ]\!].$$

From (10) we have that $C_n \wedge w = C_n \wedge \mathrm{Ppre}_1(w)$: in other words, $w$ and $\mathrm{Ppre}_1(w)$ are equal on $C_n$. We show that player 1 has a strategy $\pi_1$ such that, for all $s \in S$ and all strategies $\pi_2$ of player 2, we have

$$\mathrm{E}_s^{\pi_1, \pi_2}\{W\} \ge w(s) - \varepsilon. \qquad (11)$$

We construct the strategy $\pi_1$ as follows. In $C_n$, the strategy $\pi_1$ plays according to an optimal distribution for $\mathrm{Ppre}_1(w)$. In $C_{<n}$, the strategy $\pi_1$ plays according to an $\varepsilon_k$-optimal strategy for $J_{n-1}((C_n \wedge \mathrm{Ppre}_1(w)) \vee (C_{>n} \wedge z))$, where $k$ is the number of previous entrances in $C_n$; this strategy is constructed by induction on $n$.

To show (11), we construct a sequence of random variables $\{T_k\}_{k \ge 0}$ that converges to $W$ as $k \to \infty$; intuitively, the index $k$ represents the number of visits to $C_n$. For $A, B, C \subseteq S$ pairwise disjoint and $k \ge 0$, we introduce the following notation:

• $A \triangle_{<k} B$ (resp. $A \triangle_{k} B$, $A \triangle_{\le k} B$) denotes the event of staying forever in $A \cup B$, and visiting $B$ fewer than $k$ times (resp. $k$ times, no more than $k$ times).

• For a function $f : S \to [0, 1]$, the random variable $f(A \,\mathcal{U}^k B)$ has value $f(s_k)$ for paths having a prefix of the form $\sigma_0 s_0 \sigma_1 s_1 \cdots \sigma_k s_k$, where for $0 \le i \le k$ we have $\sigma_i \in A^*$ and $s_i \in B$, and has value 0 for paths that do not start with a prefix of this form.

• For a function $f : S \to [0, 1]$ and $\bowtie \in \{<, \le, =\}$, the random variable $f((A \triangle_{\bowtie k} B) \,\mathcal{U}\, C)$ has value $f(t)$ for paths having a prefix of the form $\sigma_0 s_0 \sigma_1 s_1 \cdots \sigma_j s_j \sigma_{j+1} t$, where $j \bowtie k$ and where for $0 \le i \le j$ we have $s_i \in C_n$, for $0 \le i \le j + 1$ we have $\sigma_i \in C_{<n}^*$, and where $t \in C_{>n}$; the random variable has value 0 for paths that do not start with a prefix of this form.

• For a function $f : S \to [0, 1]$, the random variable $f(A \,\mathcal{U}\, B)$ has value $f(t)$ for paths having a prefix of the form $\sigma t$ with $\sigma \in A^*$ and $t \in B$, and has value 0 for paths that do not start with a prefix of this form.

Finally, for $k \ge 0$ we define the random variable $T_k$ by:

$$T_k = [\mathcal{R} \wedge C_{<n} \triangle_{<k} C_n] + w(C_{<n} \,\mathcal{U}^k C_n) + z((C_{<n} \triangle_{<k} C_n) \,\mathcal{U}\, C_{>n}),$$

where for a predicate $p$, we denote by $[p]$ the random variable that has value 1 on the paths that satisfy $p$, and value 0 on the paths that do not satisfy $p$. We prove that for all $k \ge 0$ we have


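$$\mathrm{E}_s^{\pi_1, \pi_2}\{T_k\} \ge [\![ J_{n-1}((C_n \wedge w) \vee (C_{>n} \wedge z)) ]\!](s) - \sum_{i=0}^{k-1} \varepsilon_i \qquad (12)$$

for all strategies $\pi_2$ of player 2 and all $s \in S$. Note that (11) follows from (12) by taking $k \to \infty$: in fact, $\lim_{k \to \infty} \mathrm{E}_s^{\pi_1, \pi_2}\{T_k\} = \mathrm{E}_s^{\pi_1, \pi_2}\{W\}$ and $[\![ J_{n-1}((C_n \wedge w) \vee (C_{>n} \wedge z)) ]\!] = w$. To prove (12), we proceed by induction on $k$. The base case, for $k = 0$, follows from

$$\begin{aligned}
T_0 &= [\mathcal{R} \wedge \Box C_{<n}] + w(C_{<n} \,\mathcal{U}\, C_n) + z(C_{<n} \,\mathcal{U}\, C_{>n}) \\
&= [\mathcal{R} \wedge \Box C_{<n}] + ((w \wedge C_n) \vee (z \wedge C_{>n}))(\Diamond C_{\ge n}),
\end{aligned}$$

and from the induction hypothesis on $n$. As induction hypothesis, we assume that (12) holds for $k$, or,

$$\mathrm{E}_s^{\pi_1, \pi_2}\{[\mathcal{R} \wedge C_{<n} \triangle_{<k} C_n] + w(C_{<n} \,\mathcal{U}^k C_n) + z((C_{<n} \triangle_{<k} C_n) \,\mathcal{U}\, C_{>n})\} \ge w(s) - \sum_{i=0}^{k-1} \varepsilon_i. \qquad (13)$$

Moreover, by construction of the strategy $\pi_1$, for all strategies $\pi_2$ of player 2 and all $t \in S$ we have that

$$\mathrm{E}_t^{\pi_1^k, \pi_2}\{[\mathcal{R} \wedge \Box C_{<n}] + w(C_{<n} \,\mathcal{U}\, C_n) + z(C_{<n} \,\mathcal{U}\, C_{>n})\} \ge w(t) - \varepsilon_k, \qquad (14)$$

where $\pi_1^k$ is the strategy that coincides with $\pi_1$ after any path that contains $k$ visits to $C_n$. Using the bound for $w(t)$ provided by (14) for the term $w(C_{<n} \,\mathcal{U}^k C_n)$ of (13), and taking into account the prefix $C_{<n} \triangle_k C_n$ that precedes state $t$ in (13), we obtain:

$$\begin{aligned}
\mathrm{E}_s^{\pi_1, \pi_2}\{ & [\mathcal{R} \wedge C_{<n} \triangle_{<k} C_n] + z((C_{<n} \triangle_{<k} C_n) \,\mathcal{U}\, C_{>n}) + [\mathcal{R} \wedge C_{<n} \triangle_{k} C_n] \\
& + w(C_{<n} \,\mathcal{U}^{k+1} C_n) + z((C_{<n} \triangle_{k} C_n) \,\mathcal{U}\, C_{>n}) \} \ge w(s) - \sum_{i=0}^{k} \varepsilon_i,
\end{aligned}$$

and by gathering the terms,

$$\mathrm{E}_s^{\pi_1, \pi_2}\{[\mathcal{R} \wedge C_{<n} \triangle_{<k+1} C_n] + w(C_{<n} \,\mathcal{U}^{k+1} C_n) + z((C_{<n} \triangle_{<k+1} C_n) \,\mathcal{U}\, C_{>n})\} \ge w(s) - \sum_{i=0}^{k} \varepsilon_i,$$

which concludes the induction on $k$ (compare with (13)). This proves (12), and hence (11) and the lemma. ■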

The value of the game with condition $\mathcal{R}$ is then $[\![ J_{N-1}(0) ]\!]$. Both the lower and the upper bounds for the value of the game follow from the lemma, because Rabin-chain games are self-dual (the complement of a concurrent Rabin-chain game is again a concurrent Rabin-chain game). We can now summarize the results for concurrent Rabin-chain games.

Theorem 3. The following assertions hold.

1. Concurrent Rabin-chain games can be solved according to (8).

2. There are deterministic concurrent Rabin-chain games without optimal strategies and without finite-memory ε-optimal strategies.

Part 1 follows from Lemma 7. Again, the lack of optimal strategies, and of finite-memory ε-optimal strategies, follows from the result proved for Büchi games (which are special cases).

Finally, the next theorem states that if the state space is countable, rather than finite, the quantitative game $\mu$-calculus solutions presented in this paper still define the value of the game.

Theorem 4. Consider a concurrent game structure $\mathcal{G} = (S, \mathit{Moves}, \Gamma_1, \Gamma_2, p)$, where $S$ is countable. Then, formulas (2), (3), (4), (5), and (8) provide the solutions for concurrent reachability, safety, Büchi, co-Büchi, and Rabin-chain games, respectively.

This theorem can be proved by the same arguments used for finite concurrent games, using transfinite induction rather than ordinary induction when arguing about the least and greatest fixpoints of the calculus.

6.  ALGORITHMS 

Example 1 shows that the value of a game can be irrational; hence the iterative schemes may not terminate in general. Thus, in general, we can only hope for ε-approximations of the value. We give an algorithm to estimate the value of a Rabin-chain game to a given tolerance $\varepsilon$; that is, we give a decision procedure for the question: given a game $\mathcal{G}$, a state $s$ of $\mathcal{G}$, a Rabin-chain property $\varphi$, a rational $r \in [0, 1]$, and a rational tolerance $\varepsilon > 0$, is $|[\![\varphi]\!]_{\mathcal{G}}(s) - r| < \varepsilon$? The algorithm is based on the observation that the value of a Rabin-chain game can be expressed as an elementary formula in the theory of real closed fields, and uses a decision procedure for the theory of reals with addition and multiplication [Tar51]. We start with some basic definitions.

An ordered field $H$ is real-closed if no proper algebraic extension of $H$ is ordered. We denote by $\mathbf{R}$ the real-closed field $(\mathbb{R}, +, \cdot, 0, 1, \le)$ of the reals with addition and multiplication. An atomic formula is an expression of the form $p > 0$ or $p = 0$, where $p$ is a (possibly) multi-variate polynomial with integer coefficients. An elementary formula is constructed from atomic formulas by the grammar

$$\varphi ::= a \mid \neg\varphi \mid \varphi \wedge \varphi \mid \varphi \vee \varphi \mid \exists x. \varphi \mid \forall x. \varphi,$$

where $a$ is an atomic formula, $\wedge$ denotes conjunction, $\vee$ denotes disjunction, $\neg$ denotes complementation, and $\exists$ and $\forall$ denote existential and universal quantification respectively. The semantics of elementary formulas are given in a standard way [CK90]. A variable $x$ is free in the formula $\varphi$ if it is not in the scope of a quantifier $\exists x$ or $\forall x$. An elementary sentence is a formula with no free variables. A famous theorem of Tarski states that the theory of real-closed fields is decidable.

Theorem 5. [Tar51] The theory of real-closed fields in the language of ordered fields is decidable.

We start with the following classical observation [Wey50] that the minmax value can be written as an elementary formula in the theory of ordered fields. We include a proof for completeness.

Lemma 8. Let $A = (a_{ij})$ be a matrix with entries in the ordered field $H$. Then the statement $y = \mathrm{val}_1 A$ can be written as an elementary formula over $H$.

Proof. Let $A$ be an $m \times n$ matrix (that is, suppose player 1 has $m$ moves and player 2 has $n$ moves). Then $r = \mathrm{val}_1 A$ iff there exist $(x_1, \ldots, x_m) \in H^m$ and $(y_1, \ldots, y_n) \in H^n$ such that $x_i \ge 0$ for all $i = 1, \ldots, m$ and $\sum_{i=1}^{m} x_i = 1$, and similarly $y_j \ge 0$ for all $j = 1, \ldots, n$ and $\sum_{j=1}^{n} y_j = 1$; and such that $\sum_{i=1}^{m} a_{ij} x_i \ge r$ for all $j = 1, \ldots, n$ and $\sum_{j=1}^{n} a_{ij} y_j \le r$ for all $i = 1, \ldots, m$. This can be written as an elementary formula over $H$. ■
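The constraints in the proof of Lemma 8 are linear once the matrix, the two distributions, and the candidate value are fixed, so a proposed witness can be checked mechanically. The Python sketch below is illustrative only; the function name `is_value_certificate` and the example matrix are hypothetical, and exact rational arithmetic is used to avoid rounding.

```python
from fractions import Fraction


def is_value_certificate(A, x, y, r):
    """Check the constraints from the proof of Lemma 8 for an m x n matrix A:
    x and y are distributions, x secures at least r against every column,
    and y concedes at most r against every row."""
    m, n = len(A), len(A[0])
    if len(x) != m or len(y) != n:
        return False
    x_is_distribution = all(xi >= 0 for xi in x) and sum(x) == 1
    y_is_distribution = all(yj >= 0 for yj in y) and sum(y) == 1
    x_secures_r = all(
        sum(A[i][j] * x[i] for i in range(m)) >= r for j in range(n)
    )
    y_concedes_r = all(
        sum(A[i][j] * y[j] for j in range(n)) <= r for i in range(m)
    )
    return x_is_distribution and y_is_distribution and x_secures_r and y_concedes_r


half = Fraction(1, 2)
A = [[1, 0], [0, 1]]  # player 1 receives 1 exactly when the two moves match
print(is_value_certificate(A, [half, half], [half, half], half))
```

For this matrix the uniform distributions certify the value 1/2, while no deterministic choice of player 1 passes the check for that value.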

Let $y$ denote a vector of $n$ variables $y_1, \ldots, y_n$. For $\sim \in \{=, \le, \ge\}$, we write $x \sim y$ for the pointwise ordering, that is, if $\bigwedge_i x_i \sim y_i$. An immediate consequence of Lemma 8 is the following.

Corollary 1. Let $\mathcal{G}$ be a concurrent game structure over the state space $S$. Let $f \in \mathcal{F}$. Then for any state $s \in S$, the statement $y = \mathrm{Ppre}_1(f)$ can be written as an elementary formula over $\mathbf{R}$ with free variables in $y$. Let $y$ and $x$ be vectors of $n$ variables. Then $y = \mathrm{Ppre}_1(x)$ can be written as an elementary formula over $\mathbf{R}$ with free variables in $x$ and $y$.

We denote the $i$th coordinate of the vector $y$ as $y_i$; we denote the $i$th coordinate of the vector $\mathrm{Ppre}_1(x)$ as $\mathrm{Ppre}_1(x)(i)$. Using the corollary, we can now express solution formulas for reachability, safety, Büchi, co-Büchi, and Rabin-chain games as elementary formulas in the theory of real-closed fields.

Lemma 9. Let $\mathcal{G}$ be a concurrent game structure and $s$ a state in $\mathcal{G}$. Let $\Psi$ be a reachability, safety, Büchi, co-Büchi, or Rabin-chain condition, and let $\langle 1 \rangle \Psi = [\![\varphi]\!]$. The statement $y = [\![\varphi]\!]$ can be written as an elementary formula in the theory of real closed fields.

We start with reachability games. The solution of a reachability game is the least solution of the fixpoint equation given by (2). Suppose the set of states is $S = \{1, \ldots, n\}$. We have $n$ variables $y_1, \ldots, y_n$ corresponding to the $n$ states.

$$\bigwedge_{i \in U} y_i = 1 \;\wedge\; \bigwedge_{i \notin U} y_i = \mathrm{Ppre}_1(y)(i) \;\wedge\; \forall x. \Big( \Big( \bigwedge_{i \in U} x_i = 1 \wedge \bigwedge_{i \notin U} x_i = \mathrm{Ppre}_1(x)(i) \Big) \Rightarrow y \le x \Big)$$

The first part of the formula (the first two conjuncts) states that $y$ is a fixpoint; the second part states that it is the least fixpoint. We now give the formula for Büchi games. The formula states that the solution $y$ of a Büchi game with goal $\Box\Diamond U$ is the largest solution of the equation system $y = x$, where $x$ is the least solution to the fixpoint equation $x = (\neg U \wedge \mathrm{Ppre}_1(x)) \vee (U \wedge \mathrm{Ppre}_1(y))$. Formally, we define the formula in stages, as follows. Let

$$F_0(x, y) := \bigwedge_{i \in U} x_i = \mathrm{Ppre}_1(y)(i) \;\wedge\; \bigwedge_{i \notin U} x_i = \mathrm{Ppre}_1(x)(i),$$

$$F_1(y) := \exists x. \big( (y = x) \wedge F_0(x, y) \wedge (\forall x'. F_0(x', y) \Rightarrow x \le x') \big);$$

then

$$F_1(y) \wedge (\forall y'. F_1(y') \Rightarrow y' \le y)$$

gives an elementary formula with free variables $y$ that denote the value of the Büchi game from each state. Finally, we get the value of the game from state 1 by existentially quantifying all free variables other than $y_1$. The formulas for safety and co-Büchi games are analogous.

The general formula for Rabin-chain games can be written similarly by unrolling the fixpoints. For the formula $\eta_{N-1} x_{N-1}. \cdots \mu x_1. \nu x_0. (\bigvee_{i=0}^{N-1} (C_i \wedge \mathrm{Ppre}_1(x_i)))$ we proceed inside out, starting at the innermost variable. Let

$$F_0(x_{N-1}, \ldots, x_0) := \bigwedge_{k=0}^{N-1} \bigwedge_{i \in C_k} x_{0,i} = \mathrm{Ppre}_1(x_k)(i),$$

and for $j \in \{1, \ldots, N - 1\}$, let

$$F_j(x_{N-1}, \ldots, x_j) := \exists x_{j-1}. \big( (x_j = x_{j-1}) \wedge F_{j-1}(x_{N-1}, \ldots, x_j, x_{j-1}) \wedge (\forall x'_{j-1}. F_{j-1}(x_{N-1}, \ldots, x_j, x'_{j-1}) \Rightarrow x_{j-1} \sim x'_{j-1}) \big),$$

where $\sim$ is $\le$ if $j - 1$ is odd (corresponding to a least fixpoint), and $\sim$ is $\ge$ if $j - 1$ is even (corresponding to a greatest fixpoint). Finally, the solution formula is given by

$$F_{N-1}(x_{N-1}) \wedge (\forall x'_{N-1}. F_{N-1}(x'_{N-1}) \Rightarrow x_{N-1} \sim x'_{N-1})$$

(in terms of the free variables $x_{N-1}$), where $\sim$ is $\ge$ if $N - 1$ is even, and $\sim$ is $\le$ if $N - 1$ is odd. The size of the resulting formula is linear in $n$ (the size of the state space) and exponential in $N$ (the number of colors).

FIG. 1: A game that disproves the reduction to reachability. A label (a, b) of an
edge (or of a probabilistic bundle of edges) indicates that the edge is followed when
player 1 chooses move a and player 2 chooses move b.

An algorithm that approximates the value to within a tolerance $\varepsilon$ is now obtained
by binary search. In particular, we first ask the question $\exists y.\, y = [\![\varphi]\!](s) \wedge y \ge \frac{1}{2}$.
If the answer is yes, we continue the search in the subinterval $[\frac{1}{2}, 1]$, otherwise we
restrict to $[0, \frac{1}{2}]$. In this way, after $\log \frac{1}{\varepsilon}$ steps, we can approximate the value to
within $\varepsilon$.
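
The binary search described above can be sketched as follows. The threshold oracle `at_least` is a hypothetical stand-in for a decision procedure applied to the elementary formula; it is not part of the paper, and the toy oracle at the end simply hard-codes a value so that the search can be run.

```python
import math


def approximate_value(at_least, eps):
    """Binary search on [0, 1] using a threshold oracle.

    at_least(c) must answer whether the game value is >= c; here it stands in
    for a decision procedure applied to the elementary formula (an assumption).
    Returns (estimate, number_of_oracle_queries).
    """
    lo, hi = 0.0, 1.0
    queries = 0
    while hi - lo > eps:
        mid = (lo + hi) / 2
        if at_least(mid):
            lo = mid             # value lies in [mid, hi]
        else:
            hi = mid             # value lies in [lo, mid)
        queries += 1
    return (lo + hi) / 2, queries


# Toy oracle for a game whose value happens to be 2/3 (illustrative only).
true_value = 2 / 3
estimate, queries = approximate_value(lambda c: true_value >= c, eps=1e-3)
```

Each query halves the interval, so the number of queries is the base-2 logarithm of $1/\varepsilon$ rounded up; the cost of each individual query depends on the decision procedure and is not modeled here.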

Note also that the characterization of the winning value by elementary formulas
over $\mathbb{R}$ is valid only if the state space is finite. In particular, for each real $r \in [0, 1]$,
we can construct a reachability game $\mathcal{G}$ over a countable state space $S$, and a state
$s \in S$ such that the value of the reachability game at $s$ is $r$. This shows that for
countable games, the solution need not be algebraic. The game $\mathcal{G}$ is constructed as
follows. Write the binary expansion of $r$ (a dyadic rational $r$ has more than one
expansion, so choose any one arbitrarily). The players have no choice of moves
in the game: at each state, only one move is available to each player. In the $k$th
stage $s_k$, player 1 has one move that takes the game to the $(k+1)$st stage with
probability $\frac{1}{2}$, and takes the game to $t_k$ with probability $\frac{1}{2}$. The state $t_k$ is winning
if the $k$th bit in the binary expansion is a 1, and $t_k$ is losing otherwise. Each $t_k$
is a sink: once the game reaches $t_k$, it cannot proceed to any other state. The
reachability objective $U$ is to reach a $t_k$ that is winning. Clearly, $\langle 1 \rangle \Diamond U(s_1) = r$.
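
The claim that the value at $s_1$ equals $r$ can be checked on finite truncations of the game. The sketch below is illustrative only: the function names are hypothetical, and the truncation simply ignores all stages beyond a chosen depth, so it yields a lower bound that approaches $r$ as the depth grows.

```python
from fractions import Fraction


def binary_bits(r, count):
    """First `count` bits b_1, b_2, ... of a binary expansion of r in [0, 1]."""
    bits = []
    for _ in range(count):
        r *= 2
        bit = 1 if r >= 1 else 0
        bits.append(bit)
        r -= bit
    return bits


def truncated_value(r, stages):
    """Probability of reaching a winning sink within the first `stages` stages.

    From s_k the game moves to t_k or s_{k+1} with probability 1/2 each, and
    t_k is winning exactly when bit k is 1, so v_k = b_k/2 + v_{k+1}/2.
    """
    bits = binary_bits(Fraction(r), stages)
    value = Fraction(0)          # nothing beyond the truncation is counted
    for bit in reversed(bits):
        value = Fraction(bit, 2) + value / 2
    return value
```

For a value with a finite expansion such as $r = 5/8$, three stages already give the exact value; for $r = 1/3$ the truncated values stay below $r$ and the gap after $K$ stages is at most $2^{-K}$.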

7.  DISCUSSION 

The solution formulas for concurrent games that have been presented in this paper
lead to algorithms for the computation of approximate solutions of the games.
In the case of safety and reachability games, the solution formulas (2) and (3)
contain a single fixpoint operator. By computing these fixpoints in iterative fashion,
we obtain approximation schemes that converge monotonically to the solution.
The speed of convergence of such schemes has not been characterized. On
the other hand, for Büchi, co-Büchi, and Rabin-chain games, the alternation of
fixpoint operators in the solution formulas yields approximation schemes that contain
nested iterations, and it is not known how to obtain monotonically converging
approximation schemes. Specifically, from (8) the solution of a Rabin-chain
game with $2k$ colors has the fixpoint prefix $\mu x_{2k-1}.\nu x_{2k-2}.\cdots\mu x_1.\nu x_0$. Denote by
$w(n_{2k-1}, n_{2k-2}, \ldots, n_0)$ the approximation of (8) computed by approximating the
fixpoint $\eta x_i$ by $n_i$ iterations, for $i \in \{0, \ldots, 2k-1\}$. It is not known how to select
a sequence

\[
(n^{(0)}_{2k-1}, \ldots, n^{(0)}_0),\; (n^{(1)}_{2k-1}, \ldots, n^{(1)}_0),\; (n^{(2)}_{2k-1}, \ldots, n^{(2)}_0),\; \ldots
\]

such that for all $i \in \{0, \ldots, 2k-1\}$ we have $\lim_{j \to \infty} n^{(j)}_i = \infty$, and such that

\[
\lim_{j \to \infty} w(n^{(j)}_{2k-1}, \ldots, n^{(j)}_0) = \langle 1 \rangle \Psi
\]

with monotonic convergence.
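
For the single-fixpoint case mentioned at the start of this discussion, the monotone iteration can be illustrated on a very small concurrent reachability game. The sketch below is built on assumptions not taken from this paper's figures: a hypothetical game with one non-sink state in which player 1 either hides or runs and player 2 either waits or throws; (hide, wait) stays in the state, (hide, throw) and (run, wait) reach the target, and (run, throw) loses. With two moves per player, $\mathrm{Ppre}_1$ at that state is the value of a 2x2 zero-sum matrix game, computed here with the standard closed form for games without a pure saddle point.

```python
from fractions import Fraction


def matrix_game_value(a, b, c, d):
    """Value of the zero-sum game [[a, b], [c, d]] for the row maximizer."""
    maximin = max(min(a, b), min(c, d))
    minimax = min(max(a, c), max(b, d))
    if maximin == minimax:       # a pure saddle point exists
        return maximin
    return (a * d - b * c) / (a + d - b - c)   # fully mixed equilibrium


def reachability_iterates(rounds):
    """Iterates of v -> Ppre_1 at the single non-sink state, starting from 0."""
    v = Fraction(0)
    iterates = [v]
    for _ in range(rounds):
        # rows: player 1 hides / runs; columns: player 2 waits / throws
        v = matrix_game_value(v, Fraction(1), Fraction(1), Fraction(0))
        iterates.append(v)
    return iterates


iterates = reachability_iterates(5)
```

In this example the iterates are $0, 1/2, 2/3, 3/4, \ldots$, that is, $k/(k+1)$ after $k$ rounds: they increase monotonically toward 1, but only at a harmonic rate.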

This situation is in contrast with the situation for Markov decision processes.
In a Markov decision process, the problem of computing the maximal probability of
satisfying a Büchi, co-Büchi, or Rabin-chain condition $\Psi$ can be solved in polynomial
time, by reducing it to the problem of computing a maximal reachability probability
[CY90]. From $\Psi$, we can first compute the subset $T_\Psi = \{s \in S \mid \langle 1 \rangle \Psi(s) = 1\}$
of states where the maximal probability of $\Psi$ is 1. Then, we have $\langle 1 \rangle \Psi = \langle 1 \rangle \Diamond T_\Psi$,
indicating that the maximal probability of satisfying $\Psi$ is equal to the maximal
probability of reaching $T_\Psi$. In concurrent games, given a Büchi, co-Büchi, or Rabin-chain
condition $\Psi$, we can compute the set $T_\Psi$ with the algorithms of [dAH00],
setting $T_\Psi = \langle\!\langle 1 \rangle\!\rangle_{\mathit{limit}} \Psi$. If the equality $\langle 1 \rangle \Psi = \langle 1 \rangle \Diamond T_\Psi$ held for concurrent games,
it would provide monotonic approximation schemes for computing the value of the
game (the problem would still not be reducible to linear programming, since the
values may be irrational, as mentioned earlier). However, the following example
demonstrates that the equality does not hold for games.

Example 3. Consider the game depicted in Figure 1. Let $U = \{t_1, t_2, t_4\}$,
and consider the co-Büchi winning condition $\Diamond\Box U$. The sets of states $R_1$ (resp. $R_2$)
where player 1 (resp. 2) can ensure winning (resp. losing) with probability 1 are
given by

\[
R_1 = T_{\Diamond\Box U} = \{s \in S \mid \langle 1 \rangle \Diamond\Box U(s) = 1\} = \{t_1\}
\]

\[
R_2 = \{s \in S \mid \langle 2 \rangle \Box\Diamond\neg U(s) = 1\} = \{t_4, t_6\}.
\]

For $i \in \{1, 2\}$, the maximal probability for player $i$ of reaching $R_i$ from outside $R_i$
is zero: $\langle 1 \rangle \Diamond R_1(t_k) = 0$ for $k \ne 1$, and $\langle 2 \rangle \Diamond R_2(t_k) = 0$ for $k \notin \{4, 5\}$. Nevertheless,
we can verify that $\langle 1 \rangle \Diamond\Box U(t_2) = 2/3$, and $\langle 1 \rangle \Diamond\Box U(t_3) = 1/3$. ■

REFERENCES 

[ASB+95] A. Aziz, V. Singhal, F. Balarin, R.K. Brayton, and A.L. Sangiovanni-Vincentelli.
It usually works: The temporal logic of stochastic systems.
In Computer Aided Verification, volume 939 of Lect. Notes in Comp. Sci.
Springer-Verlag, 1995.

[BdA95] A. Bianco and L. de Alfaro. Model checking of probabilistic and nondeterministic
systems. In P. S. Thiagarajan, editor, Found. of Software Tech.
and Theor. Comp. Sci., volume 1026 of Lect. Notes in Comp. Sci., pages
499-513. Springer-Verlag, 1995.

[BL69] J.R. Büchi and L.H. Landweber. Solving sequential conditions by finite-state
strategies. Trans. Amer. Math. Soc., 138:295-311, 1969.

[BMCD90] J.R. Burch, K.L. McMillan, E.M. Clarke, and D.L. Dill. Sequential
circuit verification using symbolic model checking. In Proc. of the 27th
ACM/IEEE Design Automation Conference, pages 46-51, Orlando, FL,
USA, June 1990.

[CK90] C.C. Chang and H.J. Keisler. Model Theory. North Holland, 3rd edition,
1990.

[CY90] C. Courcoubetis and M. Yannakakis. Markov decision processes and regular
events. In Proc. 17th Int. Colloq. Aut. Lang. Prog., volume 443 of
Lect. Notes in Comp. Sci., pages 336-349. Springer-Verlag, 1990.

[dAH00] L. de Alfaro and T.A. Henzinger. Concurrent omega-regular games. In
Proc. 15th IEEE Symp. Logic in Comp. Sci., pages 141-154, Santa Barbara,
California, USA, 2000.

[dAHK98] L. de Alfaro, T.A. Henzinger, and O. Kupferman. Concurrent reachability
games. In Proc. 39th IEEE Symp. Found. of Comp. Sci., pages
564-575. IEEE Computer Society Press, 1998.

[dAKN+00] L. de Alfaro, M. Kwiatkowska, G. Norman, D. Parker, and R. Segala.
Symbolic model checking of concurrent probabilistic processes using
MTBDDs and the Kronecker representation. In TACAS: Tools and Algorithms
for the Construction and Analysis of Systems, volume 1785 of
Lect. Notes in Comp. Sci., pages 395-410. Springer-Verlag, 2000.

[EJ91] E.A. Emerson and C.S. Jutla. Tree automata, mu-calculus and determinacy
(extended abstract). In Proc. 32nd IEEE Symp. Found. of Comp.
Sci., pages 368-377. IEEE Computer Society Press, 1991.

[Eve57]  H.  Everett.  Recursive  games.  In  Contributions  to  the  Theory  of  Games 
III,  volume  39  of  Annals  of  Mathematical  Studies,  pages  47-78,  1957. 

[FV97] J. Filar and K. Vrieze. Competitive Markov Decision Processes.
Springer-Verlag, 1997.

[GH82]  Y.  Gurevich  and  L.  Harrington.  Trees,  automata,  and  games.  In  Proc. 
14th  ACM  Symp.  Theory  of  Comp.,  pages  60-65.  ACM  Press,  1982. 

[HK97] M. Huth and M. Kwiatkowska. Quantitative analysis and model checking.
In Proc. 12th IEEE Symp. Logic in Comp. Sci., pages 111-122, Warsaw,
Poland, 1997.

[Koz83a]  D.  Kozen.  A  probabilistic  PDL.  In  Proc.  15th  ACM  Symp.  Theory  of 
Comp.,  pages  291-297,  Boston,  Massachusetts,  USA,  1983. 

[Koz83b] D. Kozen. Results on the propositional μ-calculus. Theoretical Computer
Science, 27(3):333-354, 1983.

[KS81]  P.R.  Kumar  and  T.H.  Shiau.  Existence  of  value  and  randomized  strategies 
in  zero-sum  discrete-time  stochastic  dynamic  games.  SIAM  J.  Control  and 
Optimization,  19(5):617-634,  1981. 

[Mar98] D.A. Martin. The determinacy of Blackwell games. The Journal of Symbolic
Logic, 63(4):1565-1581, 1998.

[McI98] A. McIver. Reasoning about efficiency within a probabilistic μ-calculus.
In Proc. of PROBMIV, pages 45-58, 1998. Technical Report CSR-98-4,
University of Birmingham, School of Computer Science.

[McN93] R. McNaughton. Infinite games played on finite graphs. Ann. Pure Appl.
Logic, 65:149-184, 1993.

[MM01] A. McIver and C. Morgan. Demonic, angelic and unbounded probabilistic
choices in sequential programs. Acta Informatica, 37(4/5):329-354, 2001.

[MMS96] C. Morgan, A. McIver, and K. Seidel. Probabilistic predicate transformers.
ACM Trans. Prog. Lang. Sys., 18(3):325-353, 1996.

[Mos84] A.W. Mostowski. Regular expressions for infinite trees and a standard
form of automata. In Computation Theory, volume 208 of Lect. Notes in
Comp. Sci., pages 157-168. Springer-Verlag, 1984.

[MP91] Z. Manna and A. Pnueli. The Temporal Logic of Reactive and Concurrent
Systems: Specification. Springer-Verlag, New York, 1991.

[Owe95]  G.  Owen.  Game  Theory.  Academic  Press,  1995. 

[RF91]  T.E.S.  Raghavan  and  J.A.  Filar.  Algorithms  for  stochastic  games  —  a 
survey.  ZOR  —  Methods  and  Models  of  Op.  Res.,  35:437-472,  1991. 

[Saf88] S. Safra. On the complexity of ω-automata. In Proc. 29th IEEE Symp.
Found. of Comp. Sci., pages 319-327. IEEE Computer Society Press,
1988.

[Saf92] S. Safra. Exponential determinization for ω-automata with strong-fairness
acceptance condition. In Proc. 24th ACM Symp. Theory of Comp., pages
275-282, Victoria, British Columbia, Canada, 1992.

[Sha53]  L.S.  Shapley.  Stochastic  games.  Proc.  Nat.  Acad.  Sci.  USA,  39:1095-1100, 
1953. 

[Tar51]  A.  Tarski.  A  Decision  Method  for  Elementary  Algebra  and  Geometry. 
University  of  California  Press,  Berkeley  and  Los  Angeles,  1951. 

[Tho95] W. Thomas. On the synthesis of strategies in infinite games. In Proc. of
12th Annual Symp. on Theor. Asp. of Comp. Sci., volume 900 of Lect.
Notes in Comp. Sci., pages 1-13. Springer-Verlag, 1995.

[TV87]  F.  Thuijsman  and  O.J.  Vrieze.  The  bad  match,  a  total  reward  stochastic 
game.  Operations  Research  Spektrum,  9:93-99,  1987. 

[Var85] M.Y. Vardi. Automatic verification of probabilistic concurrent finite-state
systems. In Proc. 26th IEEE Symp. Found. of Comp. Sci., pages 327-338.
IEEE Computer Society Press, 1985.

[vNM47]  J.  von  Neumann  and  O.  Morgenstern.  Theory  of  games  and  economic 
behavior.  Princeton  University  Press,  1947. 

[VW94] M.Y. Vardi and P. Wolper. Reasoning about infinite computations.
Information and Computation, 115(1):1-37, 1994.

[Wey50] H. Weyl. Elementary proof of a minmax theorem due to von Neumann. In
Contributions to the Theory of Games, I, volume 24 of Annals of Mathematical
Studies, pages 19-25. Princeton University Press, 1950.

[Wil91]  D.  Williams.  Probability  With  Martingales.  Cambridge  University  Press, 
1991. 




